Abstract


This dissertation is a study in depth of a method, called Hierarchical Process Composition (HPC), for
organizing, developing, and maintaining large distributed programs. HPC extends the process abstraction to nested
collections of processes, allowing a multiprocess program in place of any single process, and provides a rich set of
structuring mechanisms for building distributed applications. The emphasis in HPC is on structural and
architectural issues in distributed software systems, especially interactions involving dynamic reconfiguration,
protection, and distribution. The major contributions of this work come from the detailed consideration, based on
case studies, formal analysis, and a prototype implementation, of how abstraction and composition interact in
unexpected ways with each other and with a distributed environment.

HPC ties processes together with heterogeneous interprocess communication mechanisms, such as TCP/IP and
remote procedure call. Explicit structure determines the logical connectivity between processes, masking
differences in communication mechanisms. HPC supports one-to-one, parallel channel, and many-to-many
(multicasting) connectivity. Efficient computation of end-to-end connectivity from the communication structure is a
challenging problem, and a third-party connection facility is needed to implement dynamic reconfiguration when the
logical connectivity changes.

Explicit structure also supports grouping and nesting of processes. HPC uses this process structure to define
meaningful protection domains. Access control is structured (and the basic HPC facilities may be extended) using
the same powerful tools used to define communication patterns. HPC provides escapes from the strict hierarchy for
direct communication between any two programs, enabling transparent access to global services. These escapes are
carefully controlled to prevent interference and to preserve the appearance of a strict hierarchy.

This work is also a rare case study in consistency control for non-trivial, highly-available services in a
distributed environment. Since HPC abstraction and composition operations must be available during network
partitions, basic structural constraints can be violated when separate partitions are merged. By exhaustive case
analysis, all possible merge inconsistencies that could arise in HPC have been identified and it is shown how each
inconsistency can be either avoided, automatically reconciled by the system, or reported to the user for
application-specific reconciliation.


1.  Introduction 

A thesis with a long title such as Hierarchical Process Composition and the Dynamic Maintenance of
Structure in the Distributed Environment is either very wide-ranging or narrowly focused. This thesis is a study in
depth, rather than breadth, of one method for organizing large, distributed programs. The emphasis is on structural
or architectural issues in distributed software, especially interactions involving change, protection, and distribution.
The major contribution or novelty of this work is found not in the organization method, but in the detailed
consideration of how its features interact with each other and with the environment.

This introduction addresses the questions "what", "how", and "what's different". Section 1.1 discusses the
kind of programs under consideration, the environment in which they run, and what we want to do with them.
Section 1.2 describes the basic features of the organizational method (HPC), and Section 1.3 describes how this
method differs from a variety of related systems.

1.1.  The  Problem  Area 

This thesis studies the structural implications of one method for structuring large, distributed programs. As
neither the method nor the problem it addresses is relevant for all programs, here we describe the computational
environment, the typical program, and the type of operations on programs we are addressing.

1.1.1.  The  Structure  of  Target  Applications 

The number of large, distributed applications is gradually increasing while the set of tools for structuring and
managing them is not. Here are selected, real applications and some of their relevant structural properties.

Complex production automation software is becoming common in industrial plants [Dou84], [Ran83],
[FHH87]. It is characterized by a hierarchical structure with well defined communication patterns between nodes,
great heterogeneity of computer-controlled processing and handling stations, and a significant degree of dynamic
reprogramming of stations on the fly. There is also less frequent reconfiguration to add and remove stations and to
modify their groupings; during a reconfiguration most of the plant is in continuous operation. The upper levels of
the production automation hierarchy interact with independently maintained and administered suites of management
and engineering software.

Many of these characteristics are present, but less pronounced, in process control software for industrial
applications and scientific instrumentation. For example, GPIB instruments are remotely programmed for each
experiment or test within an experiment [IEE75], [LoA78], [MuJ78], while space probes may be reprogrammed in
flight once or twice in a year [LMW86]. A scientific experiment might be organized into a small hierarchy with
several instruments controlled by laboratory minicomputers at the bottom, and the lab minis, a database machine, a
numeric processor and a display workstation at the next level.

Collaborative network services are another class of distributed application. What a client perceives as a
unified logical service may be implemented as a dynamically varying collection of peer servers. The DARPA
Internet domain name service is a well developed example of a collaborative service [Moc87b], [Moc87t]. The
actual servers are independently administered, and none of them provides complete service. Instead they provide
service only for pieces of the domain name space and referrals to other servers with adjacent parts of the name
space. The domain name service can be dynamically reconfigured to change the partition of the name space among
servers, and to change the degree of replication of given servers, without affecting the service provided. As each
server is managed autonomously, configuration is a cooperative and incremental process. The Xerox Clearinghouse
and Grapevine services are other, earlier, examples of this kind of collaboration [SBN83].

A significant class of distributed application emphasizes robustness and the ability to survive failures by
recognizing failures and taking corrective action. Such actions often change the operations of the application. This
is a more general approach than conventional fault-tolerance in which failures are masked and ideally have no effect
on the application. A classic robust application is distributed network routing as implemented in the ARPANET
[MFR78], [MRR80]. The individual packet switching nodes collaborate to recompute the best route from one node
to another as nodes and links fail and recover, and as links become more or less congested. (Here we are
considering the internal routing algorithm as the application, rather than the service provided by the ARPANET.)
This example differs from the domain name service by being centrally administered, but shares the property of
having long-lived and clearly defined communication patterns between cooperating peers.

Work in distributed problem solving has stimulated a wide range of relevant program structures.* Contract
net systems are a good example, having been used for distributed tracking of vehicles within a geographic area
[Smi78], solving heuristic search problems [Smi80], and factory automation [ShW88]. A contract net system is a
dynamically self-organizing program for allocating work to a set of processing nodes. A node breaks a complex
task into several subtasks that can be processed concurrently, and requests (usually all) other nodes to bid for a
contract on a subtask. The requesting node evaluates the bids received and awards one or more contracts. A
contract net is the graph of contracts that specify how the responsibility for completing a top-level task has been
broken up and distributed among nodes. Contract net systems smoothly adapt to varying loads, automatically
migrating new processing to the nodes with idle capacity. They organize cooperative activity among autonomous
nodes. Robustness is provided by periodically reissuing requests for bids if a task is uncompleted for whatever
reason.
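
The bidding cycle just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the protocol of [Smi78]: the names `Node`, `bid`, and `award` are hypothetical, and using remaining idle capacity as the bid value is an assumed bidding rule chosen for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A processing node; `capacity` is a hypothetical measure of idle capacity."""
    name: str
    capacity: int
    contracts: list = field(default_factory=list)

    def bid(self, subtask: str) -> int:
        # Assumed bidding rule: offer whatever capacity is still idle.
        return self.capacity - len(self.contracts)


def award(manager: Node, subtasks: list, nodes: list) -> dict:
    """Announce each subtask, collect bids, and award it to the best bidder.

    Returns the contract net as a mapping: subtask -> (manager, contractor).
    """
    net = {}
    for subtask in subtasks:
        bidders = [n for n in nodes if n is not manager]
        best = max(bidders, key=lambda n: n.bid(subtask))
        if best.bid(subtask) <= 0:
            continue  # no idle capacity anywhere; the request would be reissued later
        best.contracts.append(subtask)
        net[subtask] = (manager.name, best.name)
    return net


nodes = [Node("n0", 0), Node("n1", 2), Node("n2", 1)]
net = award(nodes[0], ["t1", "t2", "t3"], nodes)
```

In this sketch, work drifts toward whichever node still has idle capacity, and a subtask that attracts no useful bid is simply left out of the net so that a later round can reissue the request.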

There are also obvious military applications under the broad heading of command and control systems. A
combination of satellites, airborne platforms, microwave links, and ground mobile packet radio networks, operating
under conditions that encourage frequent loss and reconfiguration, provides the data communication layer for
military command and control activities. Applications, as well as the underlying communication networks, must
support reconfiguration and continuous operation.

To summarize, the typical application under consideration has most or all of the following characteristics:

• It displays a hierarchical structure where a functional unit at one level is implemented as a collection of
cooperating units at the next level down.

• It has long-lived, well-defined communication patterns defining the interactions between the siblings at a
given level.

• Its components are loosely coupled, able to do significant work without an immediate response from a
neighbor. They are often organized as functional peers, rather than master/slave or client/server.

* An excellent introductory survey is [Dec87].
• Its components represent active computational elements, like processes or tasks, rather than passive objects,
like code modules or data files.

• Parts of it are managed autonomously so that both computation and administration are distributed.

•  It  is  robust  and  adaptive  to  the  changing  conditions  of  a  distributed  environment. 

1.1.2. Dynamic Maintenance of Structure

Distributed applications emphasize change. Adaptive programs are expected to change not only their
behavior, but their internal structure, in response to new demands and environmental conditions. A long-lived
application may be expected to run continuously for longer than any given host machine or software version will
survive. Failure, migration, reconfiguration, and changing requirements all may force changes within an
application.

The interactions between applications are subject to change as well. Stable distributed services usually have
dynamically changing clients. In a complete distributed processing system, complex multiprocess programs are
manipulated even at the highest level, where entire applications (i.e., jobs) are introduced and removed over the
lifetime of the system.

This emphasis on change is a departure from the conventional environment, where the pieces of an
application and their relationships are specified statically and relatively easily. Controlling and constraining change
is a major technical challenge that confronts distributed programming. A static (or compiled) description of an
application's structure, its distribution across host machines, and its interactions with other applications is
insufficient. A framework for structuring large, distributed programs must also provide operations that modify
application structure during execution.

• In general, maintenance encompasses functions such as replacement of failed components, compensation for
partitioning, upgrading components to more recent software, and reconfiguring an application to handle more
or fewer tasks.

• The combination of autonomously administered components and dynamic change requires runtime access
control to ensure that only authorized pieces of an application are examined or modified. The same
restrictions are necessary to ensure that different applications do not interfere with one another.

• Performance and engineering issues dictate consideration of migration or relocation of an application's pieces
to accomplish load balancing, exploit locality, compensate for loss or gain of physical resources, and so forth.

• Every complex application will require some form of structural debugging to supplement conventional
debugging of individual components. Debugging may take the form of examining the application's current
structure, monitoring the communication between components, and making temporary alterations. When
bugs are found, the maintenance procedures allow the necessary repairs.

1.1.3. The Distributed Environment

There is a continuum between the extremes of centralized and distributed computing and no clear boundary
can be drawn between the two. Indeed, much work has been invested in supporting the centralized behavior to
which programmers (and paying customers) are accustomed using ever more widely distributed hardware. Remote
procedure calls, network file systems, atomic actions, and even network-wide shared memory are (not always)
successful attempts to mask the distribution of the system, or provide network transparency.

However, we are interested in the distributed extreme of the continuum for several reasons.

• Extremely distributed systems have several clear, intrinsic characteristics. Primarily, their processors (sites,
hosts) are asynchronous, and subject to independent failure. These properties have significant impact on the
software that must run on them.

• Distributed systems are often (but not intrinsically) divided into autonomous regions for administrative
reasons, and composed of heterogeneous elements. Many systems can be temporarily partitioned into
subsystems able to communicate internally but isolated from one another by failures.

• Beyond a certain physical size, it is no longer reasonable, even if possible, to mask distribution. Instead,
distribution should be made explicit in order to exploit locality, both physically and functionally.

• There are definite physical and engineering limits to masking distribution. For example, the speed of light is
already a significant factor in the latency of satellite assisted communication. The availability of services that
depend on simultaneous access to all copies of heavily replicated data decreases rapidly with increase in
replication. Software that accounts for distribution explicitly may scale, while systems that depend on a
centralized environment will not.

• Extremely distributed systems cannot make the closed world assumption common to centralized systems,
where all the interacting pieces (programs, modules, applications, clients, servers) can be described all at once
and in one place. Instead, they must assume an open system, allowing new pieces to be added to the existing
framework.

• Distributed systems must admit dynamic structure, so that pieces can be added to and removed from the
system at different times and places.
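
The remark above about heavily replicated data can be made concrete with a small calculation. The model is an assumption introduced here, not taken from the thesis: if each copy is reachable independently with probability p, an operation that needs all n copies at once succeeds with probability p raised to the power n. The function name `all_copies_availability` is hypothetical.

```python
def all_copies_availability(p: float, n: int) -> float:
    """Probability that all n replicas are reachable at the same time.

    Assumes each replica is independently reachable with probability p;
    this independence is an illustrative assumption.
    """
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return p ** n


# Availability falls quickly as the degree of replication grows.
for n in (1, 5, 20):
    print(n, round(all_copies_availability(0.95, n), 3))
```

Under this simple model, adding copies always lowers the availability of any operation that must reach every copy, which is the scaling limit the list points to.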

The ARPANET and SATNET [JBH78] wide-area networks and their attached hosts exemplify distributed
environments, while packet radio networks [JuT87], [KGB78] and the NASA Deep Space Network [Yue83]
represent some extreme cases. However, the characteristics at the end of the continuum describe distribution
independently of hardware issues like relative speed, geography, and cost. The blackboard and contract net
program structures used in some artificial intelligence work yield extremely distributed software by our criteria,
even when implemented on centralized hardware. Therefore, we will treat the distributed environment as a
programming environment, no matter where it is found, rather than a physical environment.

1.1.4. Goals

This thesis has three general goals:

• Develop a structural representation for target applications.

The representation must be adequate to describe any snapshot of a target application. It must allow for the
application features described in Section 1.1.1. This will make the transition from structured design to implemented
application direct, and therefore fast and easy.

•  Provide  operations  to  manipulate  structural  representations  during  execution. 

These operations must provide sufficient mechanism to implement the dynamic maintenance features of Section
1.1.2. It must not be possible to create illegal representations from legal ones (soundness), and it should be possible
to create any legal representation (completeness). There must also be a practical method for implementing the
operations and making them available to application designers and managers.

•  Identify  specific  environmental  influences  on  application  structure  and  management. 

Independent failure, asynchrony, and autonomy have pervasive effects on the organization of an application. These
effects will be reflected in many extremely distributed applications, whether they use our particular representation or
not, and we seek to identify them. In the context of our representation, the environment often limits our ability to
express or guarantee desirable properties. In other cases, it suggests a unification and simplification of several
features.

Many interesting topics in distributed systems have been deliberately omitted from discussion. This thesis
does not do many things. It does not formally define processes or active computation. It does not develop a formal
model of concurrency. It does not provide a new design methodology. It does not promote new programming
language concepts. It does not provide a performance model. It does not schedule processes. It does not develop
new communication protocols or network architectures. It does not manage resources. It does not mask failures. It
does not serialize application operations. Most of these issues are independent of any form of program structuring
and represent services that can be provided by host facilities beneath the system to be described or by utility
applications above it.

1.1.5.  Thesis  Outline 

The remainder of this Introduction sketches the HPC approach to structuring applications, based on process
abstraction and explicit composition, and compares it to related work. Chapter 2 introduces three of the four
exploratory themes of this thesis: protection and control structure, communication structure, and non-hierarchical
structure, and illustrates the HPC operations for run-time reconfiguration. The interactions among these features
and between them and the environment are noted throughout the following four chapters.

The HPC protection system defines what an agent is permitted to examine or change. Chapter 3 shows how
we exploit rich and explicit process structure to define meaningful domains of protection, and how control is
configured using the same powerful tools as communication. Some major benefits from this unique protection
system are direct association of protection and management, arbitrary user-defined access control policies, and a
simple mechanism for extending or modifying the built-in HPC system facilities.

In Chapter 4, we focus on interprocess communication (IPC). Starting with simple one-to-one
communication patterns, HPC incorporates multiple parallel channels and arbitrary many-to-many patterns. These
complex interactions are all expressed structurally, instead of using addressing or other properties of specific IPC
mechanisms. HPC supports heterogeneous IPC mechanisms with differing behaviors, while presenting a single
mechanism for the configuration of communicating processes. This prompts a division of communication functions
into logical configuration, transport implementation, and actual communication.

A purely functional approach to composition involving strict trees and explicit composition is impractical. In
particular, access to global services is clumsy and potentially dangerous. Chapter 5 demonstrates the clash between
transparent abstractions and purely functional compositions. HPC resolves the clash by allowing direct non-local
communication between any two points in the hierarchy, while preserving the appearance of a strict tree.
The target environment is subject to partition, and HPC permits highly available applications that continue to
run with reduced resources while partitioned. Because process structure may be freely modified during partition,
inconsistencies can be discovered upon merge. The fourth exploratory theme is the reintegration of applications that
have been modified inconsistently during partition. Chapter 6 identifies all possible HPC merge inconsistencies, and
shows how they are either avoided, automatically reconciled while preserving the pre-merge behavior, or reported
to the user with tools for application-specific reconciliation. The techniques we use for avoiding inconsistencies are
not specific to HPC, and provide useful lessons for building other highly available, reconfigurable systems.

Chapter 7 reports the prototype HPC implementation and early experiences with it.

Following our conclusions in Chapter 8, we present a formal description of HPC structure and the operations
on it in Appendix A. Soundness and completeness results are related to HPC structure considered as a formal
system. Based on our experiences, we suggest investigation of new laws of distribution and composition for strictly
hierarchical formal systems such as CSP and CCS, in part to provide sharing that those systems do not support.

1.2.  Hierarchical  Process  Composition 

The process abstraction has been used successfully as a tool to structure complex systems since the late
1960's. The THE [Dij68] and RC4000 [Bri69, Bri73] operating systems with their layers of cooperating processes
are important early examples of such process structuring [HoR73]. When considering how to organize a target
application, consisting of loosely-coupled, active peer elements with well-defined communication patterns, process
structuring should come to mind immediately as an appropriate choice.

Brian  Randell  has  emphasized  the  additional  structuring  principles  of  abstraction  and  composition  in  his  work 
on  reliable  software. 

Thus the sorts of structuring that we have discussed so far can be described as structuring within a single level of
abstraction, or horizontal structuring. ... In choosing to identify a set of levels of abstraction ... and to define their
interrelationships one is once again imposing a structure on a system, but this is a rather different form of structure
which we will refer to as vertical structuring. Thus vertical structurings describe how components are constructed,
whereas horizontal structurings describe how components interact. [Ran79]

He  also  described  the  degree  to  which  the  logical  structure  of  an  application  intended  by  the  designer  is  supported 
and  enforced  by  the  underlying  system  as  the  degree  of  actual  as  opposed  to  conceptual  structure. 

We are motivated by distribution rather than reliability, but the concepts of process structuring, abstraction,
and composition are the basis for Hierarchical Process Composition (HPC). In Randell's terminology, we will put
actual structure into distributed programs, with explicit vertical and horizontal structuring.

Our focus is on process structuring and how complex applications can be built from smaller ones, and not on
the internal behaviors of individual processes. For this reason, the definition of process is not critical to HPC. Any
favorite definition (finite state automata, infinite sequences of primitive events, or state vectors plus threads of
control) may be used. The only things we need to know about any particular process are its name and the interfaces
where it may interact with other processes. These external properties define the process abstraction.

Horizontal structure defines a graph of processes and the interactions between them. We often call the pattern
of communication in this graph the composition of processes, because it defines the behavior of the overall structure
as a function of the behaviors of its components. Vertical structure groups related processes together. By extending
the process abstraction to groups of processes, vertical structure provides the hierarchical structure typical of a
target application. By making the representation of horizontal and vertical structure explicit, maintaining it during
execution, and forcing applications to reflect their representations, actual structure can be enforced using the
mechanisms provided for dynamic maintenance.

1.2.1.  Explicit  Communication  Patterns 

To gain the greatest actual horizontal structure, we must make interfaces and the bindings between them as
explicit and visible as the associated processes. We will consider only communication between explicitly identified
partners. Each pair of partners interacts via a communication medium, whose characteristics do not concern us here.
(Examples are TCP/IP connections, semaphores, shared files, remote procedure call bindings, wires, and UNIX*
pipes.)

A process has a fixed set of communication interfaces, one for each potential partner, that may be thought of
as endpoints or sockets for communication media. A process's interfaces are distinguished according to the role
played by the partner: kernel port, logging, auditing, standard input, standard output, mailbox server.* A connection
is an instance of a communication medium joining two interfaces. We identify connections by the interfaces at their
ends.

We will now consider two small examples. Take a typical UNIX pipeline of processes A, B, and C. Every
UNIX process has a conventional abstraction: three interfaces (file descriptors) for standard input, standard output,
and standard error. In a pipeline, the output interface of one process shares a UNIX pipe with the input interface of
the next process in line. One process views the pipe as a sink, while the other views it as a source. The remaining
interfaces are connected to the terminal device by default. In Figure 1.1 this pipeline (without the terminal device)
is shown in the notation that will be used throughout. Processes are drawn as shaded rectangles; interfaces as small
tabs on processes; and connections as heavy lines.


Figure  1.1.  Simple  Pipeline 
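
The pipeline of Figure 1.1 can also be written down as data. The following Python sketch is one possible encoding under stated assumptions, not the representation used by the HPC prototype: the names `Process`, `Connection`, and `connect` are hypothetical, and the rule that an interface carries at most one connection reflects only the initial one-to-one model described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """The process abstraction: a name plus a fixed set of interface names."""
    name: str
    interfaces: frozenset


@dataclass(frozen=True)
class Connection:
    """A connection is identified by the (process, interface) pairs at its ends."""
    ends: frozenset


def connect(connections: set, a: Process, ia: str, b: Process, ib: str) -> Connection:
    """Join two interfaces, enforcing the one-to-one model."""
    if (a.name, ia) == (b.name, ib):
        raise ValueError("an interface cannot be connected to itself")
    for proc, iface in ((a, ia), (b, ib)):
        if iface not in proc.interfaces:
            raise ValueError(f"{proc.name} has no interface {iface!r}")
        if any((proc.name, iface) in c.ends for c in connections):
            raise ValueError(f"{proc.name}.{iface} is already connected")
    conn = Connection(frozenset({(a.name, ia), (b.name, ib)}))
    connections.add(conn)
    return conn


# The pipeline A | B | C, without the terminal device.
A, B, C = (Process(n, frozenset({"stdin", "stdout", "stderr"})) for n in "ABC")
pipeline = set()
connect(pipeline, A, "stdout", B, "stdin")
connect(pipeline, B, "stdout", C, "stdin")
```

Because a connection is named only by its two endpoints, the structure says nothing about the medium itself, which matches the decision to leave the characteristics of the medium aside.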

The example illustrated in Figure 1.2 uses remote procedure call (RPC) bindings rather than UNIX pipes as
the basic communication media. Processes FileServer and NameServer each have an RPC interface, corresponding
to its external entries, to provide its service. Process Client has two interfaces, corresponding to its stubs, one to
obtain file service and the other to obtain name service. Connections between interfaces indicate RPC bindings. It
is important that Client's interfaces are distinguished so that an appropriate server can be bound to each set of stubs.

* UNIX is a registered trademark of AT&T.


*  Contwk;  t  K«nnd  kitnuSen  in  |BUt3] 
Figure 1.2. Simple Client and Server

We are not so cavalier about the semantics of communication media as might appear from this introduction.
Later chapters explore the integration of multiplexing, multicasting, and bundling paths into this initially one-to-one
model of communication, the implications of requiring explicit partners, the distinction between physical and logical
media, and the degree to which several common interprocess communication mechanisms fit the HPC model.

1.2.2.  Nested  Groups  of  Processes 

When two multiprocess programs are connected, the boundary between them is lost in the composition. To
retain the abstract grouping of related processes, we must incorporate vertical structure. An HPC object is a named,
active entity with distinct interfaces, just like a process. However, an object is implemented by an explicit
composition of processes that can be described and manipulated in the HPC system, while processes are
implemented by some primitive behavior that cannot. The boundary between the external abstraction and the
internal composition of an object is a shell.

Objects obviously capture nesting or vertical structure at one level, and it is natural to extend the
process abstraction by allowing an object anywhere a process could be used. This leads immediately to
a hierarchy or tree of objects rather than a single level of clustering or grouping. The leaves of the
tree are real processes running on some machine. All the nodes of the tree can be treated like real
processes, but the internal nodes are collections of abstract processes, some of which may be real
processes and some of which may themselves be collections. Near the top of the tree we find
abstractions dealing with activities of broad scope and complexity. Near the bottom are abstractions
with simple behavior and limited complexity.

A UNIX pipeline can itself be used as a component in a larger pipeline. The command (A | B) | (C | D)
defines a pipeline of two components, each of which is a pipeline of two components. However, in UNIX
this logical nesting is completely lost by the time the command is implemented. In HPC we represent
the nesting explicitly as shown in Figure 1.3. A shell is drawn as a rectangle surrounding the
contents of the corresponding object.

Figure 1.3. Pipeline of Pipelines
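
The contrast can be made concrete with a small sketch. It is illustrative only and is not part of the
HPC system: the tuple encoding, the placeholder stage names A through D, and the functions flatten and
depth are assumptions introduced here. A nested tuple stands for the explicit HPC nesting of a
pipeline of two components, each itself a pipeline of two components, and flatten models the way UNIX
discards that nesting when the command is implemented.

```python
# Illustrative sketch only; the encoding and all names are assumptions.
# A pipeline is a tuple of stages; a stage is either a command name
# (a string) or another pipeline (a nested tuple).

def flatten(pipeline):
    """Return the commands in order, discarding all nesting."""
    commands = []
    for stage in pipeline:
        if isinstance(stage, tuple):
            commands.extend(flatten(stage))
        else:
            commands.append(stage)
    return commands

def depth(pipeline):
    """Count the levels of nesting retained in the explicit structure."""
    inner = [depth(stage) for stage in pipeline if isinstance(stage, tuple)]
    return 1 + max(inner, default=0)

nested = (("A", "B"), ("C", "D"))   # a pipeline of two two-stage pipelines
flat = tuple(flatten(nested))       # what remains once the nesting is lost

print(flat)           # ('A', 'B', 'C', 'D')
print(depth(nested))  # 2
print(depth(flat))    # 1
```

The flattened form still runs the same four stages in the same order, but nothing in it records which
stages were meant to be treated as a unit.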

By encapsulating our complex client and server programs as objects we can readily separate their
internal process structures from the interactions between them. Figure 1.4 shows how the Client and
NameServer of Figure 1.2 can be expanded internally to more complex structures without affecting the
interactions between them.


Figure 1.4. Multiprocess Client and Server


Hierarchical organization is a natural result of applying the principle of abstraction generously. It
is a good method for implementing complex software, because it closely approximates the structure
designers actually use in creating their applications (Section ...). Are there other, perhaps better,
ways to organize complex applications than to force everything into hierarchies? Our response is
'probably not.' To quote from Simon:

    The fact, then, that many complex systems have a nearly decomposable, hierarchic structure is a
    major facilitating factor enabling us to understand, to describe, and even to see such systems
    and their parts. Or perhaps the proposition should be put the other way round. If there are
    important systems in the world which are complex without being hierarchic, they may to a
    considerable extent escape our observation and our understanding. Analysis of their behavior
    would involve such detailed knowledge and calculation of the interactions of the elementary
    parts that it would be beyond our capacities of memory or computation. [Sim62]


However, strict hierarchies can not realistically express the kind of sharing and access to global
resources needed in a real system. Unless the mathematical elegance of strict functional composition
is the sole criterion, this limitation must be overcome. This issue is explored in detail in
subsequent Chapters.

1.2.3. Active versus Passive Hierarchies

Even after a decision to use hierarchical grouping as a method for organizing complex software, there
remains a choice between active and passive hierarchies. In an active hierarchy the internal nodes in
the tree are processes. All interactions with a subtree are actually interactions with the process at
its root. In a passive hierarchy the internal nodes of the tree are abstractions. All processes are
at the leaves of the tree. In both cases we assume that, to obtain the benefits of the hierarchical
discipline, a node can be connected only to its parent, its children and its immediate siblings.
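
The connection discipline assumed for both kinds of hierarchy can be stated as a small predicate. The
following is a minimal sketch, not the HPC implementation: the parent-map encoding, the function name
may_connect, and the example node names are all assumptions made for illustration, and "immediate
siblings" is read here as children of the same parent.

```python
# Illustrative sketch only; the encoding and all names are assumptions.

def may_connect(parent, a, b):
    """True if a and b are parent and child, or children of one parent."""
    if a == b:
        return False
    pa, pb = parent.get(a), parent.get(b)
    if pa == b or pb == a:
        return True
    return pa is not None and pa == pb

# Hypothetical tree: App contains Client and Server; Server contains Worker.
parent = {"App": None, "Client": "App", "Server": "App", "Worker": "Server"}

print(may_connect(parent, "Client", "Server"))  # True  (siblings)
print(may_connect(parent, "Server", "Worker"))  # True  (parent and child)
print(may_connect(parent, "Client", "Worker"))  # False (neither)
```

The last case is the interesting one: Client can reach Worker only through Server, which is exactly
the restriction that keeps the internal structure of Server hidden.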

To some, active hierarchies may seem like the natural choice. No unfamiliar abstract objects are
introduced, it is clear where control of a process group resides, and the process hierarchies
supported by most existing operating systems (e.g., UNIX) are of this type. However, they are an
inadequate solution to the problems we are trying to solve. Let us examine the obstacles active
hierarchies would present.

First of all, an active hierarchy is insufficiently abstract. We really do want to introduce those
unfamiliar abstract objects. It is very important to distinguish a complex application from the
processes that happen to be implementing it at the moment. There must be a name or placeholder for a
subtree independent of its root process, or else the root process can not be transparently replaced
by another.

Second, active hierarchies do not extend the process abstraction well. A subtree can not cleanly
replace an arbitrary process. Replacing a leaf with a subtree, or vice versa, is clean, but replacing
an internal node (subtree root) with a non-trivial subtree always destroys the relation its children
had with the internal node, and generally destroys the sibling relation among them as well.
Permissible connections depend on the sibling relation, so this is a serious problem.

Third, a single process at the root of a subtree represents an unacceptable potential for
single-point control failure. Redundant control is critical for some distributed applications. There
must be a way to spread control responsibilities for a subtree among several processes, or at least
ensure an automatic promotion to the root for alternate processes in the event of the failure of the
current root.

These obstacles can be overcome by the introduction of something akin to our abstract object, so that
active hierarchies are now trees of objects, which might have a single process or an active hierarchy
within them. But now the active hierarchy has become a passive hierarchy! Provision for redundant
control requires even further extensions.

1.2.4. Dynamic Process Structure

A static snapshot of an application is intrinsically simpler than a specification that describes how
the application is to adapt and evolve over time. In an open system, future components (and the
future policies governing further changes) may not even exist when a snapshot is taken, so a fully
general specification must be open-ended in some sense; it can not encompass all future
configurations of the application. For this reason, we use a procedural, rather than declarative,
description of change, and define the mechanisms by which an application may be modified, rather
than the policies governing the appropriate modifications. In fact, we will have no completely static
representations. All application structure is described dynamically; even a nominally static
structure must be built incrementally from an empty structure using the transformation operations.

There are six basic operations on process structure in HPC. These are create and destroy process,
create and destroy connection, and create and destroy shell. There are no general tree editing
operations such as move subtree, only operations which create new structure, have generally local
effects, and their inverses. Creating and destroying processes are basic operations in most operating
systems, but the other operations are usually limited or unavailable.
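
To make the shape of these six operations concrete, the following is a minimal sketch rather than the
HPC interface itself. The class name Structure, its method and field names, and the rule that only
empty, unconnected nodes may be destroyed are assumptions made for illustration. The sketch also
simplifies by connecting whole nodes, whereas HPC connections join interfaces.

```python
# Illustrative sketch only; all names and the destroy rules are assumptions.

class Structure:
    def __init__(self):
        self.kind = {"root": "shell"}   # node name -> "shell" or "process"
        self.parent = {"root": None}    # node name -> enclosing shell
        self.connections = set()        # unordered pairs of node names

    def _create(self, name, kind, shell):
        if name in self.kind:
            raise ValueError(f"{name} already exists")
        if self.kind.get(shell) != "shell":
            raise ValueError(f"{shell} is not a shell")
        self.kind[name] = kind
        self.parent[name] = shell

    def _destroy(self, name, kind):
        if name == "root" or self.kind.get(name) != kind:
            raise ValueError(f"{name} is not a destroyable {kind}")
        if name in self.parent.values():
            raise ValueError(f"{name} is not empty")
        if any(name in pair for pair in self.connections):
            raise ValueError(f"{name} is still connected")
        del self.kind[name]
        del self.parent[name]

    def create_process(self, name, shell="root"):
        self._create(name, "process", shell)

    def destroy_process(self, name):
        self._destroy(name, "process")

    def create_shell(self, name, shell="root"):
        self._create(name, "shell", shell)

    def destroy_shell(self, name):
        self._destroy(name, "shell")

    def create_connection(self, a, b):
        if a == b or a not in self.kind or b not in self.kind:
            raise ValueError("a connection needs two distinct existing nodes")
        self.connections.add(frozenset((a, b)))

    def destroy_connection(self, a, b):
        self.connections.remove(frozenset((a, b)))

# Build a small structure from the empty one, then take it apart again.
s = Structure()
s.create_shell("Pipeline")
s.create_process("A", "Pipeline")
s.create_process("B", "Pipeline")
s.create_connection("A", "B")
built = dict(s.parent)

s.destroy_connection("A", "B")
s.destroy_process("A")
s.destroy_process("B")
s.destroy_shell("Pipeline")
print(s.parent)   # {'root': None}
```

Each create has exactly one inverse, and the example returns to the empty structure by applying the
inverses in reverse order; there is no operation that moves a subtree.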

There is also an operation to examine the process structure at any moment. This gives an immediate
advantage over existing systems for inspecting and manipulating multiprocess programs. Most operating
systems can provide a list of the currently active processes, but not a list of which processes are
interacting with which others. By examining the process structure, we can tell exactly how processes
are interacting and what the logical significance of each interaction is.
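
As a rough illustration of what such an examination could report, the sketch below turns a recorded
set of connections into a table of interaction partners. It is an assumption-laden example: the
record layout (two process names plus a purpose label), the function name partners, and the snapshot
contents are invented here and are modelled loosely on the client and servers of Figure 1.2.

```python
# Illustrative sketch only; the record layout and all names are assumptions.
from collections import defaultdict

def partners(connections):
    """Map each process to the sorted list of (partner, purpose) pairs."""
    table = defaultdict(list)
    for a, b, purpose in connections:
        table[a].append((b, purpose))
        table[b].append((a, purpose))
    return {name: sorted(peers) for name, peers in table.items()}

# Hypothetical snapshot of an application's connections.
snapshot = [
    ("Client", "FileServer", "file service"),
    ("Client", "NameServer", "name service"),
]

for name, peers in sorted(partners(snapshot).items()):
    print(name, peers)
```

A plain process list would show only the three names; the connection records add who is interacting
with whom and why.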

A (distributed) service maintains a database of the current dynamic process structure, and translates
the basic HPC operations into the necessary low-level host operations on processes and so forth. This
HPC service runs as a user program on top of conventional host operating systems, and provides an
abstract interface for application managers. Most operations on HPC process structure require only
local database manipulations, and involve no physical resources from the underlying hosts.

1.3. Related Work

In some ways, HPC represents the last of a long period of loosely-coupled distributed systems work at
the University of Rochester. The well-known Rochester Intelligent Gateway [BFL76] distributed
operating system and PLITS [Fel79] distributed programming language established a strong departmental
interest in an asynchronous, message-passing model of interaction. Two subsequent projects,
Activities [EFH82] and Super [Ary81], are especially related to HPC.

The Activities work is the primary starting point for HPC. The activity model provides a tool to
describe the relationships between objects involved in the execution of a shared, distributed task. A
single object may participate in many different activities and a single activity may be made up of
numerous subactivities. At the language level, tags are used to identify the activity affiliation of
data and messages [Hel84]. HPC began as an attempt to define, in detail, operating system support for
the activity model. It quickly diverged, although previous work on activities had important influence
throughout.

Super is an exploration of communication via broadcast source-addressed messages, and of how language
support of programs using such a medium could be provided, that parallels HPC in several ways. The
Super programming language provided nested groups of processes, together with distinguished processes
to control and manage such groups. Both of these constructs are primitive in HPC structural
representations. Super also requires the concept of a secretary process (communication filter). Such
filters are frequently convenient in HPC, but are neither critical nor built in to the system. We
indicate these parallels as evidence for the universal nature of the structures, at least in a
loosely coupled environment, as Super had no direct influence on HPC.

1.3.1. CONIC

Beyond a doubt, the work most closely related to HPC is the CONIC toolkit, developed independently
of, and earlier than, HPC at University College London [KMS87], [KrM85]. Like HPC, CONIC provides for
explicit communication interfaces and bindings, a process abstraction generalized to nested groups of
processes, and fully dynamic structure. The CONIC system implementation is far more developed than
the HPC system prototype and has been applied to several industrial applications.

The major original contributions of HPC are easiest to evaluate by comparison to CONIC. First, CONIC
as currently implemented provides a single native communication mechanism, while HPC was designed to
use a variety of (heterogeneous) mechanisms. As a result, HPC has facilities to "type-check"
communication paths to ensure each logical path can be implemented. More significantly, HPC provides
for explicit expression of multiplexing, multicasting, and bundling of multiple communication paths
as part of the horizontal structure to capture the richness of communication media.

CONIC  has  no  provision  for  protection  or  domains  of  autonomous  management.  Most  systems  that  use  a 
general  protection  system  such  as  access  control  lists  or  capabilities,  do  so  because  they  have  no  obvious  structure  to 
exploit.  HPC  uses  rich  vertical  structure  in  the  definition  of  protection  domains  that  provide  common  management 
for  related  objects.  In  addition,  the  "controls"  relation,  which  is  as  important  as  the  "communicates  with"  relation,  is 
manifest  in  HPC.  Control  behavior  is  subject  to  the  same  principles  of  abstraction  and  composition  as  other  process 
interactions. 

While both systems concentrate on hierarchies of processes, HPC also allows exceptions to a strict
tree. This permits more natural access to global services than is possible in a tree. The more
general graph structures resulting from exceptions have to be carefully disciplined to preserve the
behavior of a strict hierarchy, while still allowing access between arbitrary points in the tree; it
would not be practical in HPC without exploiting the protection system.

CONIC and HPC had somewhat different motivations that account for some less well defined differences.
HPC has an emphasis on extremely distributed systems. This led to features, e.g., continued operation
during partition and reporting end-to-end connectivity without violating abstraction boundaries, that
are not as well developed in CONIC. As the HPC system was not intended as a stand-alone system nor as
a complete one, its relation to application processes and to host operating systems is more precisely
defined than in CONIC. In particular, manipulations of rich, abstract application structure are
completely distinguished from the primitive operations on sparse, native processes and communication
media. This precise abstract structure, in turn, has suggested specific research into additional
distributive laws for formal systems such as Hoare's CSP [Hoa85].

1.3.2. Task Forces

Since the late 1970's, several lines of operating system research have explored an explicit form of
structuring for multiple processes often known as task forces. A task force usually consists of a
variable number of processes performing the same or similar functions.

One prominent line of research stresses the independence of address space and thread of control, and
the resulting efficiencies due to shared memory communication and faster context switches between
processes using the same address space. The related Thoth [Che82], Verex [Loc79], and V kernel
[Che84] systems, and the unrelated Mach [VTR87], [ABB86] system are examples of this development.
While the grouping of processes into task forces (Thoth: tasks into teams; Mach: threads into tasks)
is very well defined in these systems, there is little or no support for further structuring of
processes in the same task force, or of multiple task forces. Communication is based on promiscuous
broadcast within a group or other mechanisms (ports, links, addresses) that can not be treated as
explicit, visible bindings.

A second line of development emphasizes an object-invocation model of interaction, where multiple
active processes may service concurrent invocations of a given object. Three well-known examples are
Argus [LiS83], Eden [ABL85], and Clouds [LeW85]. Communication interfaces in these systems are
defined and controlled very clearly on the receiving side of an invocation, but the binding between
calling and called objects is left implicit in the pattern of invocations during runtime. Most
analyses (e.g., for deadlock freedom, or for debugging) of applications run on such systems require
the "calls" or "depends" relation between objects, which again suggests that the horizontal structure
should be manifest in the structure, as in HPC, and not inferred from the dynamic behavior.

Vertical process structure is limited to the single level of process clustering within objects.
However, both Argus and Clouds provide additional structuring in the form of transactions that define
apparently atomic activities that may involve processes within several objects. Even ignoring the
aspect of atomicity, HPC has no comparable facility for defining activities that involve several
applications, perhaps overlapping with other such activities, only for grouping applications into a
larger one.

Two important systems based on capability-controlled access and message passing were implemented as
part of the Cm* multiprocessor project ([SFS77], [SBL77]). These are StarOS [JCD79] and Medusa
[Ous81], and both systems allow the creation of distributed task forces of cooperating processes.
StarOS focuses on ease of use and a general capability mechanism, while Medusa stresses the effect of
distributed hardware on system software (Section 8.2.1 of [Ous81]). Medusa, more than any of the other
task force systems, addresses the structural issues central to HPC.

Each Medusa process has a private capability list, the processes in a task force share a list, and
every process has access to a list of global utility capabilities. Horizontal structure in Medusa is
explicitly controlled by these lists, which are distinct from the processes that may access them.
Each slot in a list can be treated as an abstract interface, where the capability in the slot
specifies the implementation. This definition of explicit interfaces is so clean and comprehensive
that the complete state of a process, including its memory pages and access to secondary storage, is
accessible through its capability lists. One pleasant result is that one process can take over
completely for another in the same task force either temporarily ("buddy" exception handling) or
permanently. Dynamic load balancing and system reconfiguration are possible by replacing the
capabilities for overloaded or failed processes on the fly. HPC can only approximate this clean
replacement, because process state is a primitive feature outside the HPC structural system.

In Medusa, unlike StarOS, the vertical structuring of processes into task forces is maintained during
execution, available for debugging and monitoring. However, Medusa is a one level system. There is no
facility for grouping task forces into larger applications, and utility task forces are treated
somewhat specially by the system. The same directness applies to communication; while capabilities
can be replaced, they can not be chained, indirected, or sent in messages. One result is that a task
force must explicitly provide for each of its users; there is no way to export or otherwise transmit
access to a task force. HPC provides much more powerful organizing tools in this area.

1.3.3. Software Design Tools

Software design tools emphasize methodical development, clean abstraction, and reusable modules. This
emphasis often encourages (or enforces) software architectures that have a great deal in common with
loosely coupled, distributed applications. In particular, they invariably provide nested abstractions
with explicit interfaces and intermodule bindings. When a design tool provides active entities as a
basic component, the possible structures are much like static HPC structures.

The SARA system is one such tool that has been used to design, model, and simulate both sequential
and concurrent software [EFR86]. SARA distinguishes vertical and horizontal structure to the extent
that different languages are used to describe them [PeB79], but goes beyond structuring (syntax) to
include a third language to describe the behavior (semantics) of an application. Intentionally, there
is no comparable feature in HPC.

There are many other software design tools with some relevance to HPC, but we will mention in passing
only SADT [Ros85] and DREAM [Rid81]. The relationship of HPC to SARA, along with these other tools,
is more complementary than competitive. While HPC has no semantic component, and the design tools do
not allow for dynamic structure (failure, reconfiguration, etc.), the methodology and design-time
support provided by the tools can be usefully matched by the run-time support provided by HPC.

This match has been realized by the STILE/GEM combination [SBW87]. STILE is a software design system
of the type discussed here, while GEM is a real-time operating system for robotic applications. The
GEM multiprocessors are physically close together, but the software environment is quite loosely
coupled, and STILE/GEM allows for direct run-time support of design abstractions, explicit
distribution and (limited) dynamic reconfiguration. Because GEM was intended for specific
applications and hardware environments, it does not address issues like protection, partition, and
dynamic process creation that are addressed by HPC. Like CONIC, GEM provides a complete, native,
run-time system, while HPC does not.

1.3.4. Programming Languages

Several programming languages support nested processes, in either passive or active hierarchies. A
number of these languages are related to Hoare's CSP [Hoa85], either directly or by parallel
development; ECSP [BRT84], Occam [TaW82], Planet [CrE84], and Platon [SiS75] are four examples.
Because these languages express structure directly in program syntax, they have very limited ability
to express either dynamic or persistent structure. ECSP, for example, allows for dynamic
reconfiguration, but all potential module bindings must be manifest in the original program. Process
lifetimes are limited to a strict fork-join discipline, so processes can not be detached to run on
their own, nor can new ones be added to a group once it has been created. These limitations are
intrinsic to any system that treats each application (program) as a closed system. HPC manages
applications as an open collection of persistent data structures.

While interfaces and bindings between siblings are fairly well defined in CSP-like languages, the
bindings between processes at greater distances in the tree are extended by scoping and visibility
rules of the language. These rules limit the possible patterns of communication but do not make the
specific patterns explicit. (In ECSP, the specific patterns are not even decidable in general.) A
more serious problem is that each process must name its communication partner(s) directly to create a
binding. This limits the use of abstraction, because a process must have information about
arbitrarily distant structure. All HPC connections are strictly local, which allows rigorous
information hiding, and bindings are computed incrementally on the basis of possibly many
connections.
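
The idea of computing an end-to-end binding from strictly local connections can be sketched as
follows. This is a minimal illustration under stated assumptions, not the HPC algorithm: the endpoint
naming scheme (an "inner" and an "outer" face for every shell interface), the functions symmetric and
resolve, and the example shells S1 and S2 are all invented here. Each connection joins two interfaces
within one composition, and a binding is found by crossing shells one at a time.

```python
# Illustrative sketch only; the naming scheme and functions are assumptions.

def symmetric(pairs):
    """Build a lookup table that maps each end of a pair to the other."""
    table = {}
    for a, b in pairs:
        table[a] = b
        table[b] = a
    return table

def resolve(start, link, through):
    """Follow local connections from a process interface to its partner.

    link    -- local connections between interfaces in one composition
    through -- the inner and outer faces of each shell interface
    Returns None if the chain is incomplete or loops back on itself.
    """
    seen = set()
    here = link.get(start)
    while here in through and here not in seen:
        seen.add(here)
        here = link.get(through[here])   # cross the shell, then follow on
    return here if here not in through else None

# Hypothetical structure: B lies inside shell S1 and C inside shell S2.
link = symmetric([
    ("B.out", "S1.out/inner"),          # connection inside S1
    ("S1.out/outer", "S2.in/outer"),    # connection in the enclosing space
    ("S2.in/inner", "C.in"),            # connection inside S2
])
through = symmetric([
    ("S1.out/inner", "S1.out/outer"),
    ("S2.in/inner", "S2.in/outer"),
])

print(resolve("B.out", link, through))  # C.in
print(resolve("C.in", link, through))   # B.out
```

No connection in the example mentions both B and C; neither process needs to name its partner, and
each of the three connections is visible only within its own composition.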

The PRONET [LeM82] programming language is more attractive for extremely distributed software than
the CSP-like languages. It uses a separate sublanguage (NETSLA) for explicit module interconnection
that avoids the objections raised in the previous paragraph. Modules and intermodule bindings can be
created and destroyed at any time, and the contents of each abstract object are managed and controlled
by precisely the NETSLA code that created the abstract object. Rich communication structure, for
example explicit multiplexing, multicasting, and bundling of interfaces, can be expressed in the
module definition sublanguage (ALSTEN).

However, PRONET still suffers from a closed world assumption, because there is no way to introduce
new types of modules into a running system nor any way to name or communicate with an independent
program (e.g., contact a global service). While asynchronous creation and destruction of modules are
addressed by PRONET, more general issues of decentralized control (partition and multiple agents) are
not. In fact, the control and management portions of a PRONET program are implicitly single threaded
and centralized, even if the primitive processes run concurrently.

There are of course very many other programming languages that allow multiple processes. The ones
discussed here are most relevant to HPC and the structures it can express, and space prohibits even a
simple listing of such languages. For example, DPL-82 [Eri82] is a language very different from PRONET
and the CSP-like languages that also provides nested groups of processes with explicit interfaces and
bindings, but the similarity adds nothing significant.

2. Hierarchical Process Composition

Black boxes are good physical abstractions. They are distinguished by the sockets on their exterior
panels and their behavior at those sockets. Their internal structure and state are not accessible.
All we know about a black box is its name, its sockets for cables, and the signals it produces at
those sockets.

Interesting ensembles of several black boxes are created by connecting pairs of sockets with cables.
In such an ensemble, any cable may be replaced by an equivalent one, but generally no box or socket
can be substituted for any other. Therefore, we must distinguish boxes and sockets, but need not
distinguish cables.

We often wish to enclose an interconnected collection of boxes in a chassis or cabinet for
convenience (or perhaps, to introduce a level of abstraction). This will hide all the internal wiring
that is irrelevant to the user of the overall collection. To accomplish this, we must have some
sockets that pass entirely through the cabinet. On the exterior, the cabinet appears to be a black
box and the sockets are used as we have already described. On the interior, the sockets must be
connected to the "free" sockets of the connected black boxes.

Objects, interfaces, connections, and shells are the HPC analogues to black boxes, sockets, cables
and cabinets. The hierarchical organization of modern electronic equipment is clear: gates composed
into integrated circuits, integrated circuits and discrete components composed into circuit boards,
boards composed into shared bus modules, modules composed into computers, computers composed into
networks ... At each level there is an explicit composition of black boxes, which are abstractions of
the next lower level. This is the way HPC uses the principles of abstraction and composition to
organize complex, multiprocess programs.

However, a realistic system must be fleshed out with protection against unauthorized access or
interference between applications, escapes from the strict object hierarchy when appropriate, and
provision for a rich set of interprocess communication patterns. The interactions of these additional
structural features are the subject of much of this dissertation. They are introduced in Section 2.1,
and examined in detail in the subsequent three Chapters.

HPC, unlike black boxes, supports dynamic reconfiguration of both abstractions and compositions under
program control. The system provides operations that incrementally modify process structure during
execution. Section 2.2 presents the entire set of fourteen HPC operations, organized by the
structural features they affect.

The first significant interaction between multiprocess abstractions and the distributed environment
is a requirement for an asynchronous system interface. Section 2.3 notes the clash between
synchronous operations on process structure and the asynchrony among agents and between agents and
sources of failure. By stressing dynamic structure, we are led to adopt an unconventional system
interface when compared to most distributed software systems.

2.1. Structural Features

2.1.1. Dual Representation

Nested HPC shells define a rooted tree of objects. A tree is often the most intuitive representation
for complex applications, but a more general representation is sometimes needed. The HPC dual graph
captures the desired escapes from a strict hierarchy and simplifies the presentation of the
protection and communication structures.

The hierarchy emphasizes vertical structure, where objects are nodes and the parent-child relation
defines the edges. The dual graph* emphasizes the horizontal structure of a system, where
compositions are nodes and the abstract interfaces define the edges. Both interpretations of a small
object are shown in Figure 2.1. (Hierarchies are always drawn as nested rectangles, while dual graphs
are always drawn using circles.) Shells above and edges below correspond directly, while nodes in the
dual graph represent spaces between shells in the hierarchy. Figure 2.1 successively highlights each
dual node and its hierarchical equivalent.


Figure  2.1.  Two  Representations  of  a  Small  Object 

The parent and child shells in the hierarchy are the edges incident on a space in the dual graph. A space has
one set of interfaces from each of these edges. In a simple space, these interfaces are accessed directly by a process,
and in a complex space, they are composed by a set of HPC connections. An active hierarchy would have simple
spaces throughout, while the passive HPC hierarchy has simple spaces only at the leaves.
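
The correspondence between the two representations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of HPC: the `Shell` class, the `dual_graph` and `incident_shells` functions, and the naming of a space as "inside X" are all hypothetical. Each shell becomes one dual edge joining the space outside it to the space it encloses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

ROOT_SPACE = "root"  # assumed name for the distinguished root space


@dataclass
class Shell:
    """A shell in the hierarchy; `children` are the shells nested inside it."""
    name: str
    children: List["Shell"] = field(default_factory=list)


def dual_graph(top_level: List[Shell]) -> Dict[str, Tuple[str, str]]:
    """Map each shell to the pair of spaces it joins: (outer space, inner space)."""
    edges: Dict[str, Tuple[str, str]] = {}

    def visit(shell: Shell, outer: str) -> None:
        inner = f"inside {shell.name}"      # the space enclosed by this shell
        edges[shell.name] = (outer, inner)  # one dual edge per shell
        for child in shell.children:
            visit(child, inner)

    for shell in top_level:
        visit(shell, ROOT_SPACE)
    return edges


def incident_shells(space: str, edges: Dict[str, Tuple[str, str]]) -> List[str]:
    """Shells (dual edges) incident on a space: its parent shell and child shells."""
    return sorted(name for name, ends in edges.items() if space in ends)
```

In this sketch a leaf space has exactly one incident shell, which matches the statement that the passive hierarchy has simple spaces only at the leaves.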

The dual graph has a distinguished root space and the subtrees rooted there form a forest of hierarchical
objects. These top-level objects are the most independent activities in the system, generally corresponding to user
terminal sessions and to long-lived system services.

Every HPC hierarchy has a dual graph, but Figure 2.2 demonstrates a dual graph that cannot be expressed as
a hierarchy. The dual graph ignores the hierarchical orientation, which simplifies several technical definitions and
allows a uniform treatment of composition in strictly tree-like and cyclic process structures.


¹ Actually, a line graph, for those fussy about graph-theoretic terminology.

Figure 2.2. A Non-Hierarchical Structure


2.1.2.  Protection  and  Control  Structure 

The basic concepts of the standard protection model are domain and agent (or principal) [Jon79]. A domain
is a set of entities and specific operations on them. An agent is an active entity authorized to apply a domain's
operations on its entities. In general, the mappings between agents and domains, and between domains and
contents, may be many-to-many. That is, domains may have several agents, an operation may be in several
domains, and so forth. The protection relation is the mapping among agents, domains, and domain contents. In
conventional protection systems, the protection relation is associated either with the domains (access control lists),
or with the agents (capability lists). In both cases, it is difficult to determine the inverse mapping.

HPC exploits the coherence and locality of the rich, explicit process structure to define protection domains,
instead of using a conventional protection system that makes few, if any, assumptions about the structure of
domains. Domains are disjoint, connected subtrees in the dual structure graph. Every space belongs to exactly one
domain. In the hierarchical view, this defines a coarse hierarchy of domains on top of the hierarchy of objects.
Some, but not all, shells delineate domains while all shells delineate objects. Figure 2.3 illustrates three domains.
(Domain boundaries are drawn as double lines.)


Figure  2.3.  Three  Adjacent  Domains 

All features of a space and all operations on them are in its domain, but operations requiring operands in
multiple domains are not included in any domain. This omission is the fundamental protection feature, and
effectively confines the effects of an operation to a single domain. An agent either has complete control over a
domain or none at all. The protection mechanism provides no form of limited access such as "read only", but we
will see this does not impede arbitrarily complex access policies.

Operations on HPC structure are provided through a built-in controller object in each domain. A controller
accepts control messages from connected objects and invokes the corresponding operations on structure. It refuses
to carry out operations outside its domain. (The controller does not have any physical existence; it is simply an
explicit placeholder for the HPC system, and therefore can be made robust, available everywhere and whenever the
HPC system is working.)
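
The all-or-nothing rule can be illustrated with a small Python sketch. It is a toy model under assumptions, not the HPC controller: the `Controller` class, its `handle` method, and the `domain_of` lookup table are hypothetical names. The point it shows is that an operation with any operand outside the controller's domain is refused outright, so no operation spans two domains.

```python
class OutsideDomainError(Exception):
    """Raised when an operand lies outside the controller's domain."""


class Controller:
    def __init__(self, domain, domain_of):
        self.domain = domain        # the single domain this controller serves
        self.domain_of = domain_of  # hypothetical map: structural element -> domain
        self.accepted = []          # operations carried out so far

    def handle(self, operation, *operands):
        """Carry out a control message only if every operand is in this domain."""
        foreign = [x for x in operands if self.domain_of.get(x) != self.domain]
        if foreign:
            raise OutsideDomainError(f"{operation}: {foreign} not in {self.domain}")
        self.accepted.append((operation, operands))
```

An agent connected to the controller of domain D can therefore operate freely on D and on nothing else; finer-grained policies have to be built from objects placed between the agent and the controller.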

Objects connected to a controller, and thus able to exert their control over a domain, are, de facto, agents.
Multiprocess agent objects immediately allow redundant control of robust or decentralized applications. However,
partition or site failure can lead to situations where no agent is physically able to exert its control over a domain.
The HPC system provides positive control to ensure every domain is under control of some agent at all times.

A small domain is shown in Figure 2.4. Domain D is a short pipeline implemented by two worker objects, A
and B. Object M manages the domain, monitoring and instructing A, and sending HPC commands to the controller
C as needed. (Controller objects are drawn in dark grey.)


Figure  2.4.  Domain  with  Agent 

Chapter 3 discusses protection and control structure in more detail, especially positive control and the user
definition of arbitrary access policies.

2.1.3. Communication Structure

Composition is fundamental to the behavior of any system, while abstraction is merely an unavoidable
convenience. That is, raw, primitive, individual components can be composed into a useful system without the
benefits of further abstraction, while complex abstractions remain isolated and collectively useless without a way of
specifying interactions between them and within them. Therefore, it is not surprising that communication structure,
defining the possible compositions of objects, is the most complex part of HPC.

HPC supports heterogeneous communication mechanisms and manipulations of multiple related
communication paths in a single operation. It also extends the one-to-one communication patterns shown so far
with a general many-to-many multicasting facility. These features are introduced here, with considerably more
detail in Chapter 4.

Both sides of an interface can be manipulated separately. The independent parts, called views, are basic
building blocks that can be combined in two ways. Pairs of views can be connected to form links in a
communication path, and views can be assembled into tree structures to form more complex interfaces.


Simple views control individual communication media such as a UNIX pipe, a TCP/IP connection, or a
remote procedure call binding. When a view is created, the appropriate mechanism is specified together with any
constraints such as orientation. For example, pipes and RPC bindings are always oriented (out/in and client/server,
respectively) while TCP/IP connections are generally not. Some simple view structures are:

    [ structure: simple pipe in ]

    [ structure: simple TCP/IP in-out ]

    [ structure: simple Courier client ]

We say views are compatible when their specified mechanisms are the same (and their constraints are
complementary). HPC requires that both views of an interface, and both views of a connection, be compatible to
ensure a transport medium can be implemented for each logical communication path.
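
The compatibility rule for simple views can be sketched as follows. This is a minimal illustration, assuming that an orientation is stored as a plain string, that `in-out` and an unspecified orientation are each their own complement, and that the names `SimpleView`, `COMPLEMENT`, and `compatible` are hypothetical; complex views are not covered.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed table of complementary constraints; None means "no constraint given".
COMPLEMENT = {
    "in": "out",
    "out": "in",
    "client": "server",
    "server": "client",
    "in-out": "in-out",
    None: None,
}


@dataclass(frozen=True)
class SimpleView:
    mechanism: str                     # e.g. "pipe", "TCP/IP", "Courier"
    orientation: Optional[str] = None  # e.g. "in", "out", "client", "server"


def compatible(a: SimpleView, b: SimpleView) -> bool:
    """Same mechanism, and the orientation of one complements the other."""
    return a.mechanism == b.mechanism and COMPLEMENT[a.orientation] == b.orientation
```

Under these assumptions a pipe's `in` view is compatible with a pipe's `out` view, while a Courier client cannot be joined to a TCP/IP view, whatever the orientations.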

Often it is desirable to connect two objects with a single abstract medium, while allowing an implementation
using more than one physical communication channel. A physical example is a cable with multiple conductors, the
programming analogy is the record data structure, and distributed program examples are paired half-duplex
channels, and out-of-band channels. To support this kind of grouping we use bundle views. A bundle has a fixed,
ordered set of possibly heterogeneous component structures. The bundle corresponding to the conventional UNIX
standard I/O interface is:²

    [ role: "UNIX stdio",
      structure: bundle
        [ role: "stdin",
          structure: simple "UNIX-stream" in
        ]
        [ role: "stdout",
          structure: simple "UNIX-stream" out
        ]
        [ role: "stderr",
          structure: simple "UNIX-stream" out
        ]
    ]

Another common requirement is for dynamically changing numbers of homogeneous media. A physical
example is the trunk line³, the programming analogy is the set data structure, and a distributed program example is a
multiplexed service. This motivates the multiplex view. A multiplex view has a single, fixed component structure,
but varying numbers of component views with that structure can be freely created and destroyed during execution.
Each view represents a distinct communication channel.


² The role keyword is introduced here to clarify the example. Its full significance will be explained later.
³ Considering virtual circuits as the media, instead of the wires.


Figure  2.5.  Multiplexed  Server 

The X Window system is a typical multiplexed service. Clients of the X Window system each have a view
with this simple structure

    [ role: "X Window",
      structure: simple TCP/IP in-out
    ]

while the network interface of the X Window system server has this multiplex structure

    [ role: "X Window server",
      structure: multiplex
        [ role: "X Window service",
          structure: simple TCP/IP
        ]
    ]


The different kinds of complex views can be combined hierarchically. Here, for example, is a multiplex view
with a bundle component, the bundle having two simple components:

    [ role: "two-function",
      structure: multiplex
        [ role: "package binding",
          structure: bundle
            [ role: "func-1",
              structure: simple Courier server
            ]
            [ role: "func-2",
              structure: simple Courier server
            ]
        ]
    ]

One-to-many and many-to-many communication patterns are a third important class of relationships.
Physical examples are the bus and triple modular redundancy systems, a programming example is the FORTRAN
common block, and a distributed program example is the contract bidding algorithm. HPC must demonstrate that
these relationships can be expressed in abstract structure, rather than physical addressing.

The multicast view is the building block for all one-to-many and many-to-many communication patterns.
Like multiplex views, multicast views are defined with a single, fixed component structure, and component views
can be created dynamically. However, each component of a multiplex view represents a distinct medium passing
through the view, while all components of a multicast view represent a single medium. Messages arriving at any
multicast component on one side of an interface depart from all components on the other side. In Figure 2.6, a
message sent by Multi will be delivered to both receiving objects, including Uni-1.
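
The delivery rule for a multicast interface can be sketched in Python. This is a toy model, not HPC's implementation: the class `MulticastInterface`, the side names `inner` and `outer`, and the component name `Uni-2` are hypothetical, and each component is modelled simply as a list of the messages that leave through it.

```python
class MulticastInterface:
    """Both views of one multicast interface, each with dynamic components."""

    def __init__(self):
        # For each side, a map from component name to the messages leaving it.
        self.components = {"inner": {}, "outer": {}}

    def new(self, side, name):
        """Create a component view on one side of the interface."""
        self.components[side][name] = []

    def send(self, side, name, message):
        """A message arriving at one component departs from all on the other side."""
        if name not in self.components[side]:
            raise KeyError(name)
        other = "outer" if side == "inner" else "inner"
        for outbox in self.components[other].values():
            outbox.append(message)
```

A multiplex view with the same component structure would instead keep every component as a separate channel, so a message would reach exactly one peer.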


Figure 2.6. Multicasting

2.1.4.  Non-Hierarchical  Structure 

So  far,  our  structural  representation  allows  only  trees.  Every  space  defines  a  purely  functional  composition  of 
the  adjacent  subtrees.  From  the  viewpoint  of  a  functional  programming  purist,  this  purity  of  composition  is  very 
nice.  From  the  viewpoint  of  a  practical  system  designer,  it  can  be  more  trouble  than  it  is  worth. 

To  use  a  module  (object)  in  a  purely  functional  fashion,  its  user  must  provide  explicitly  all  needed  external 
resources.  Uninteresting  plumbing  accumulates  higher  in  the  object  hierarchy  for  the  sole  purpose  of  connecting 
global  services  to  modules  at  lower  levels.  The  housekeeping  and  clutter  hide  the  structure  relevant  to  the  object’s 
behavior. 


Figure 2.7. Clutter in a Strict Tree

The problem is worse in an open system. To make use of a newly installed service in a purely functional
system, the shells between the service and its client must be ripped out and recreated with an additional interface to
pass the new service. This is clearly inadequate and motivates violations of the strict hierarchy.

To provide the necessary escape, we let any two complex spaces be joined by a splice. For communication, a
splice behaves just like a shell, providing a set of interfaces that communicate between the spaces. However, both
ends of a splice appear to be domain boundaries pointing "down". This preserves the appearance of a strict rooted
tree even when a splice joins two spaces in the same domain (even one space to itself). Chapter 5 discusses
non-hierarchical structure further.


2.2.  Structural  Operations 


2.2.1.  Examination 

An agent must know the current process structure before it can make sensible decisions about how to manage
it. The inquire operation provides the necessary access to the HPC system's structural database. It takes a
structural element, such as a shell or an interface, and returns information about its properties and its neighborhood
in the process structure. The process structure is unmodified by examination.

There are specialized inquiries for:

•  The  parent  and  children  of  a  shell 

•  The  interfaces  of  a  shell 

•  The parent and children of a view

•  The shell of a view

•  The  view  (if  any)  connected  to  a  view 

•  The  root  shell  and  controller  of  a  domain 

•  The  kind  of  structural  element  represented  by  an  arbitrary  name 

The data returned from these inquiries is sufficient to write simple tree-search algorithms to traverse a domain
exhaustively. Different database access mechanisms could have been provided. For example, a snapshot of the
entire domain could be provided for off-line consideration, but such snapshots may be arbitrarily large and become
out-of-date with the slightest change to the domain.
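
A tree search of the kind described might look like the sketch below. It is an illustration only: `children_of` and `is_boundary` stand in for the "children of a shell" inquiry and for a test of whether a shell delineates another domain, and both names are assumptions rather than HPC operations. The sketch also assumes that a boundary shell is itself visible from outside but that nothing inside it is.

```python
def traverse_domain(root_shell, children_of, is_boundary):
    """Return every shell reachable from root_shell without crossing a boundary."""
    visited = []
    pending = [root_shell]
    while pending:
        shell = pending.pop()
        visited.append(shell)
        if shell != root_shell and is_boundary(shell):
            continue  # opaque: the structure inside belongs to another domain
        pending.extend(children_of(shell))
    return visited
```

Because each step is a separate inquiry, the result is only as current as the oldest answer used; this is the same staleness problem noted above for snapshots, on a smaller scale.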

Because inquire is an operation on structure, the protection system makes domain boundaries opaque. Only
agents of a domain can examine that domain's internal structure.

2.2.2.  Connections 

Communication paths between processes are incrementally created and destroyed by creating connections
between objects. The connect operation takes two views as arguments and creates a connection between them. The
views must belong to the same space, be distinct, compatible, and have no existing connections. Disconnect is the
inverse operation. It takes two views that must be joined by a connection, and removes the connection.
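
The preconditions on connect and disconnect can be written out as explicit checks. The following Python sketch is an assumption-laden illustration: the `Connections` class, the `space_of` table, and the `compatible` predicate passed in by the caller are hypothetical, and errors are reported with ordinary exceptions rather than through any HPC notification mechanism.

```python
class Connections:
    def __init__(self, space_of, compatible):
        self.space_of = space_of      # hypothetical map: view -> space
        self.compatible = compatible  # hypothetical predicate on two views
        self.peer = {}                # view -> the view it is connected to

    def connect(self, a, b):
        if a == b:
            raise ValueError("views must be distinct")
        if self.space_of[a] != self.space_of[b]:
            raise ValueError("views must belong to the same space")
        if not self.compatible(a, b):
            raise ValueError("views must be compatible")
        if a in self.peer or b in self.peer:
            raise ValueError("views must have no existing connections")
        self.peer[a], self.peer[b] = b, a

    def disconnect(self, a, b):
        if self.peer.get(a) != b:
            raise ValueError("views are not joined by a connection")
        del self.peer[a], self.peer[b]
```

Keeping the table symmetric (each view records its peer) makes the "view connected to a view" inquiry a single lookup.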


Figure 2.8. Connect and Disconnect


2.2.3. Media

When a sequence of connect operations has created a complete logical path between two processes, the HPC
system creates a communication medium. The processes then communicate using the operations appropriate to the
medium. Typical operations are send and receive for messages and datagrams, write and read for files, pipes, and
streams, and call, accept and reply for remote procedure calls.

The HPC system makes these operations available for each communication mechanism it supports, but does
not define their semantics or the behavior of the communication medium. Operations on media have no effect on
HPC process structure.

2.2.4. Processes

To create a new process, the animate operation takes an empty object, a host identifier, a host-dependent
image identifier, and a list of strings. The host is contacted, the image is started on the host, and the new process is
placed in the empty object. The process receives the strings as initial arguments in a host-dependent way. There are
two versions of the inverse operation. A process uses die to terminate itself. The kill operation terminates another
process. Both operations take a simple domain as argument and destroy its internal process, leaving an empty
object. (Processes are shown in light grey, empty spaces with a dot.)


Figure 2.9. Animate, Die, and Kill (arrows labeled Animate (S, host, image, parameters) and Kill (S) / Die (S))

These operations also apply to complex objects. A process decides to become a simple or complex object
during its animation in a negotiation with the HPC system. This decision is not visible to the agent invoking
animate. To animate a complex domain, two new empty objects are created inside the previously empty leaf, a
controller and the new process are placed inside the new objects, and a connection is created to join them.


Figure  2.10.  Complex  Animation 

As a convenience, kill and die can be applied to arbitrary subtrees. The entire subtree is removed, no matter
how complex or how many subordinate domains are affected.

2.2.5. Shells

A shell is created explicitly by calling the enclose operation with a space, a partition of its incident interfaces,
and a list of interface descriptions. There must be no connections between interfaces in different partitions. The
space is divided into two spaces, both in the same domain, and one group of interfaces is moved to each space. The
spaces are joined by a new shell with the desired interfaces. To destroy a shell using disclose, its interfaces must
have no connections, and the spaces it joins must be in the same domain. Those spaces and their incident edges are
merged together into a single space, and the shell is destroyed.
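
The effect of enclose and disclose on the dual graph can be sketched as set manipulation. This is a simplified model under several assumptions: interfaces are tracked at the granularity of whole shells, `incident` maps each space to the set of shells incident on it, `connections` is a set of pairs of shells whose interfaces are connected, and the same-domain check is omitted. None of these names come from HPC.

```python
def enclose(incident, connections, space, lower, new_shell, new_space):
    """Split `space` in two, joined by `new_shell`; `lower` moves to `new_space`."""
    lower = set(lower)
    if not lower <= incident[space]:
        raise ValueError("shells to enclose must be incident on the space")
    for a, b in connections:
        if (a in lower) != (b in lower):
            raise ValueError("a connection crosses the partition")
    incident[new_space] = lower | {new_shell}
    incident[space] = (incident[space] - lower) | {new_shell}


def disclose(incident, connections, shell):
    """Merge the two spaces joined by `shell` and destroy the shell."""
    if any(shell in pair for pair in connections):
        raise ValueError("the shell's interfaces must have no connections")
    keep, gone = sorted(s for s, shells in incident.items() if shell in shells)
    incident[keep] = (incident[keep] | incident.pop(gone)) - {shell}
```

In this model disclose exactly undoes enclose, which reflects the description of the two operations as inverses.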


Figure 2.11. Enclose and Disclose

The names of these operations are taken from their effects in the hierarchical view. Creating a shell has the
effect of surrounding the lower partition of shells, while destroying a shell releases the internal structure.
Because both a space and a bipartition are implicitly defined by a set of sibling objects in the hierarchy, the actual
arguments for enclose are just a list of shells and a list of interface descriptions. Spaces are never explicitly
manipulated in HPC, only shells.

2.2.6.  Interfaces 

Most interfaces are automatically created and destroyed when their parent or attached shell is manipulated,
but multicast and multiplex interfaces have a dynamically varying number of component views. Each such
component is created by the new operation on its parent view. A component view is destroyed by the delete
operation. It must be a component of a multicast or multiplex view, and must not be connected.


Figure 2.12. New and Delete (arrows labeled New (P) and Delete (C))


2.2.7. Domains

The domain creation operation, invest, takes a shell and an empty object as arguments. The object and the
existing controller must be on opposite sides of the shell. The existing domain is reduced to the subtree on the
controller's side of the shell, and the rest of the tree becomes a new domain. The controller for the new domain is
placed in the empty object.
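
The way invest divides a domain can be sketched on the dual graph. This is a hedged illustration, not the HPC algorithm: `edges` maps each shell to the pair of spaces it joins, `domain` is the set of spaces in the existing domain, and `controller_space` is the space holding the existing controller; all three are hypothetical representations, and the placement of the new controller is not modelled.

```python
def invest(edges, domain, controller_space, shell):
    """Split `domain` at `shell`; return (remaining domain, new domain)."""
    a, b = edges[shell]
    if a not in domain or b not in domain:
        raise ValueError("the shell must lie inside the domain")
    # Dual edges that stay inside the domain once the chosen shell is cut.
    internal = [ends for name, ends in edges.items()
                if name != shell and ends[0] in domain and ends[1] in domain]
    # Everything still reachable from the controller stays in the old domain.
    remaining, frontier = {controller_space}, [controller_space]
    while frontier:
        here = frontier.pop()
        for x, y in internal:
            for near, far in ((x, y), (y, x)):
                if near == here and far not in remaining:
                    remaining.add(far)
                    frontier.append(far)
    return remaining, set(domain) - remaining
```

Since a domain is a connected subtree of the dual graph, cutting one edge always yields exactly two connected pieces, so the "rest of the tree" is well defined.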

The abdicate operation takes as its argument an adjacent domain boundary to remove. The existing
controller is removed, and the domain is merged with the adjacent domain. The depose operation is similar, but takes
control from the adjacent domain instead of yielding control to it. Figure 2.13 illustrates these operations. The
arguments for domain operations must be in the domain of the agent, just as for any other operations. An edge with
an ellipsis (...) represents a sequence of one or more shells that must belong to a single domain.


2.2.8.  Splices 

Splices are created between existing empty objects using a two-step operation. Suppose objects A and B are
to be spliced.


Figure  2.14.  Splice  (Before) 

The agent for A invokes the splice operation with two empty shells. The existing shell A and space are
replaced by the splice C, one end of which is hidden in a space accessible only to the HPC system. There is no
immediate effect on B.


Figure 2.15. Splice (During)

When the agent for B decides to complete the splice, it invokes splice with the arguments reversed. The
existing shell B and space are replaced by the previously hidden end of the splice C.


Figure  2.16.  Splice  (After) 


2.3. Discussion

We  begin  to  find  interactions  between  multiprocess  abstractions  and  the  distributed  environment  already. 
Failures,  concurrency,  and  dynamic  structure  lead  to  a  model  of  interaction  between  application  managers  and  the 
operating  system  quite  different  from  most  distributed  systems. 

A domain may have several agents (or a multiprocess agent) and the agent processes may choose to operate
on a common piece of process structure. Executing on separate sites, their decisions are intrinsically asynchronous
with respect to one another. In the absence of failures, cooperative action could be left to an explicit distributed
consensus among agents, or provided in some form through communication between agents and the HPC system.
However, failures of processes and hosts will occur at arbitrary times even if decision making among agents is
synchronized. The environment is an additional agent with whom we cannot debate or negotiate. As a result, an
agent's view of process structure can become out-of-date at any time, even while trying to decide what to do about
the previous state.

Transaction facilities are one conventional response to this possibility. They prevent the appearance of
outside interference during a sequence of operations, but they do so by delaying or undoing operations in order to
obtain the desired serialization. A transaction that is affected by a failure will be aborted, and the system reset to its
original state. This is inappropriate for applications that must always make forward progress, are fundamentally
asynchronous and therefore non-serial, must meet certain minimum performance requirements, or, most
significantly, must adapt explicitly to failures and changing conditions. Obviously failures and other changes must
not be hidden from applications in the latter class.

This raises a classic question: Should an agent monitor the process structure by polling, or should it receive
an asynchronous notification of a change? Polling is obviously the wrong answer. An agent could easily spend all
its time futilely looking for something that changed since the last time it was examined. Therefore, any system
designed for adaptation to failure under user control must, at a minimum, provide asynchronous notification of the
failure of processes and communication links (partition).

Any agent is a potentially distributed object, capable of carrying out several plans of action concurrently. If
an agent blocked or was otherwise prevented from invoking an operation until some previous operation completed,
this desirable concurrency would be eliminated. HPC obtains a more powerful system interface by allowing
structural operations to proceed asynchronously with respect to the invoking agent, and using the asynchronous
notification system to report the results of agent-requested operations as well as process and network failures.

The HPC system gives operations an at-most-once semantics within each partition, internally synchronizing
when necessary to preserve the internal consistency of the process structure. An agent will see a serialization of
operations, but it may be a different sequence for each agent due to asynchrony.
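
The shape of such an interface can be sketched in Python. This is a single-threaded toy, not the HPC system: the class `StructureService`, its methods, and the notification tuples are hypothetical, and operations are merely recorded rather than applied to any structure. It shows requests returning immediately, with results and failures arriving later on one notification stream.

```python
from collections import deque
from itertools import count


class StructureService:
    """Toy stand-in for the system side of an asynchronous interface."""

    def __init__(self):
        self._ids = count(1)
        self._pending = deque()       # requests accepted but not yet carried out
        self.notifications = deque()  # everything reported back to the agent

    def request(self, operation, *operands):
        """Queue an operation and return at once with a request identifier."""
        request_id = next(self._ids)
        self._pending.append((request_id, operation, operands))
        return request_id

    def report_failure(self, element):
        """The environment acts on its own: a process or link has failed."""
        self.notifications.append(("failed", element))

    def step(self):
        """Carry out one queued request and report its result."""
        if self._pending:
            request_id, operation, operands = self._pending.popleft()
            self.notifications.append(("completed", request_id))
```

An agent written against this interface can issue several requests before reading any notification, and must be ready to find a failure report ahead of the completion it was waiting for.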

The disadvantage of this more flexible interface is that agents must be prepared for asynchronous
interruptions of their plans and arbitrary delays while waiting for an operation to complete (or fail). Most
distributed systems present a synchronous interface to their clients, usually in terms of local system calls on a host
operating system, or remote procedure calls between application processes, or object-oriented operations on a
distributed data structure. Asynchronous notification of changes in the environment (e.g., signals, upcalls, or
callbacks) is usually restricted to special cases and circumstances. For example, under UNIX a process can
receive notification that more data is available in a file, but not that the file has been deleted, renamed, or opened by
other processes. This simplifies programming of the processes that are not involved in configuration and on-line
management, but makes it difficult or impossible to write a manager process.


Chapter  3 


Protection and Control Structure

3. Protection and Control Structure

The  potential  for  change  raises  the  questions  of  who  may  change  a  thing  and  what  things  they  may  change. 
Answers  are  usually  given  in  terms  of  a  standard  protection  model  [Jon79],  in  which  there  is  a  collection  of  distinct 
objects,  with  operations  that  may  access  them  in  various  ways.  The  rights  to  perform  specific  accesses  on  specific 
objects  are  collected  into  sets  called  domains.  Privileges  are  distributed  among  active  entities,  known  as  principals, 
by  associating  a  principal  with  a  set  of  domains.  A  principal  may  carry  out  an  access  if  one  of  its  domains  contains 
a  corresponding  right. 

HPC does not require the full generality of the standard model. An object in the standard model is an HPC
structural feature, such as a process, shell, or interface. HPC does not distinguish types of access, focusing on rights
to specific operations, such as connect(view1, view2). Domains are sets of rights, as in the standard model, and
HPC agents are principals.
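
The access rule of the standard model, specialized to rights on specific operations, can be stated in a few lines of Python. The function `may_perform` and the two lookup tables are hypothetical names used for illustration; a right is represented here as a tuple naming an operation and its operands.

```python
def may_perform(principal, right, domains_of, rights_in):
    """A principal may act if any one of its domains contains the right."""
    return any(right in rights_in[d] for d in domains_of.get(principal, ()))
```

For example, with `rights_in = {"D": {("connect", "view1", "view2")}}` and `domains_of = {"M": ["D"]}`, agent M may perform `("connect", "view1", "view2")`, while a principal associated with no domain may perform nothing.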

Policy and mechanism are clearly distinct in protection issues. A protection system provides a mechanism to
enforce access control or information control policies established outside the system. These control policies are not
arbitrary; they are based on the management of objects. Managers set policy and the implementation of policy
requires action and change. Agents carrying out policy must be empowered to make the necessary changes.
Conversely, an agent should not possess rights it does not need in order to implement policy.

Conventional protection systems (access control lists, capability lists) must allow for arbitrary collections of
rights (and by implication objects) because there is no other mechanism capable of expressing the relationships
among the objects in the system. This is unnecessarily general, because random collections of objects never have a
common manager, while related objects often do. A contribution of HPC is the exploitation of the rich and explicit
vertical process structure in the definition of protection domains. To date, protection mechanisms associated with
hierarchically organized software have all been based on (static) scope rules for identifier visibility.

We believe that "controls", as a relationship, is as important as "communicates with". It should be possible to
build sophisticated control behavior out of less complex components, with the same compositional properties, and
benefits, as communication behavior. HPC's second contribution in protection structure is the application of a
powerful mechanism for identifying, composing, debugging, and controlling control behavior. These functions are
rarely, if ever, available in conventional protection systems.

Section 3.1 motivates these contributions, building on the static domain, agent, and controller definitions
presented in Chapter 2. We observe several interactions between the protection system, the hierarchy, and the
requirements of distributed applications that can lead to internal contradictions and show how they have been
avoided in HPC.

Section 3.2 continues the investigation of these interactions with the preservation of structural invariants by
operations on structure. Creation and destruction of most structural features interact nicely with the protection
system, but direct domain manipulation and process creation require special attention to avoid violating structural
constraints and to limit the structural damage a malicious or erroneous agent can inflict.

User agent control of a domain may be lost, and some form of clean-up or recovery action must take place.
Section 3.3 introduces the policies that the HPC system can be asked to apply automatically when user agents are
unavailable, and shows how losses of control can occur either temporarily or permanently.

The basic HPC protection mechanism does not distinguish among types of access, or provide for arbitrary
collections of rights. Because control is explicitly composed, user-defined objects can implement finer or more
complex access policies, and transparently extend the system primitives, without additional support from the HPC
system. (Section 3.4.)

The concluding Section looks briefly at some classic issues in protection, such as amplification and revocation
of rights in the context of HPC.

3.1.  Static  Structure 

The  possible  agents  and  rights  of  any  HPC  protection  mechanism  can  be  easily  defined.  Processes  are  the 
primitive  active  components  in  HPC,  do  all  the  work,  and  thus  make  all  the  control  policy.  By  abstraction,  an 
arbitrary  object  should  be  able  to  do  anything  a  process  can  do,  making  objects  the  obvious  candidates  for  agents. 
Applications  are  described  in  terms  of  shells,  interfaces,  connections,  and  processes.  The  protection  mechanism  is 
to  control  the  rights  to  operate  on  these  structural  features. 

A protection system also incorporates two relations, one between domains and rights, and the other between
agents and domains. The invariant properties of these relations are the interesting features of a system. HPC tightly
restricts the rights relation to follow vertical structure. Some necessary properties of the agents relation can be
deduced from the principles of abstraction and composition, and from the need for redundancy and robustness.

3.1.1.  Rights  Relation 

Arbitrarily constructed domains can and should be avoided. The grouping defined by vertical structure gives a
strong guideline for the construction of domains. Grouping within a shell shows a coherent collective activity. If
two activities don’t interact directly, shells should be used to make their boundaries, and independence, manifest.
When an application is structured properly, the contents of a space define a tight composition of cooperating objects
that should be managed as a unit. Therefore, HPC defines domains at the granularity of spaces. The views incident
on a space and the connections between them belong to the same domain.

Vertical structure also guides the clustering of spaces into domains. We do not expect common development
and management of arbitrarily chosen pieces of an application. However, several levels of related abstractions that
interact closely are often managed as a unit. For example, a program module may include many functions with
independent interfaces, but the functions in a module are all related.

HPC exploits this locality in constraining domains to be disjoint, contiguous subtrees of the dual graph.
Because domain contents are localized, domains are readily identified and traversed on the basis of local
information. Every space knows which of its neighbors are in the same domain, and there is no need for an explicit
list  of  a  domain's  spaces.  Restricting  a  space  to  exactly  one  domain  induces  a  coarse  domain  graph  on  top  of  the 
dual  structure  graph.  This  is  an  elegant  relation  that  allows  acceptably  simple  operations  on  domains  and  is 
sufficient  for  our  target  applications. 
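
The traversal of a domain from purely local information can be illustrated with a small Python sketch. It is not taken from the HPC prototype: the class Space, the flag boundary_above, and the helper functions are hypothetical names chosen for this illustration, and the sketch assumes that each space records only whether the shell to its parent is also a domain boundary.

```python
class Space:
    """A node of the dual graph (hypothetical model, not the HPC prototype)."""

    def __init__(self, name, parent=None, boundary_above=False):
        self.name = name
        self.parent = parent
        self.children = []
        # True when the shell between this space and its parent is a domain boundary.
        self.boundary_above = boundary_above
        if parent is not None:
            parent.children.append(self)


def domain_root(space):
    """Climb until a domain boundary (or the root of the tree) is reached."""
    while space.parent is not None and not space.boundary_above:
        space = space.parent
    return space


def domain_spaces(space):
    """Names of all spaces in the same domain, found without any explicit list."""
    found, stack = [], [domain_root(space)]
    while stack:
        s = stack.pop()
        found.append(s.name)
        stack.extend(c for c in s.children if not c.boundary_above)
    return sorted(found)


def domain_graph(root):
    """Coarse domain graph as (superior domain, inferior domain) pairs."""
    edges, stack = [], [root]
    while stack:
        s = stack.pop()
        for c in s.children:
            if c.boundary_above:
                edges.append((domain_root(s).name, c.name))
            stack.append(c)
    return sorted(edges)


root = Space("root")
app = Space("app", root, boundary_above=True)
module = Space("module", app)
function = Space("function", module)
server = Space("server", app, boundary_above=True)
```

Because a space belongs to exactly one domain, the coarse graph returned by domain_graph is itself a tree, which is what keeps operations on domains simple.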

Other choices of domains are worth further study. For example, a nested relation on domains provides a
protection model analogous to conventional programming language scope rules: inner agents (code) can affect
enclosing structure (variables) while outer agents are more restricted. Operations on nested domains probably
would be much more complex in order to preserve the more complicated structural invariants. This objection would
not apply to the most general protection model with arbitrary domain overlaps, because there would be no
constraints to preserve.

3.1.2.  Agents  Relation 

The rights relation has remained stable throughout the HPC project, but an acceptable relation between agents
and domains was more difficult to achieve. Two significantly different versions have been developed since the HPC
protection system’s introduction in [LeF85] and [LeF85].

There  are  several  criteria  to  consider: 

•  Preserve  abstraction 

A  complex  domain  should  be  opaque,  indistinguishable  from  a  process. 

•  Redundant  control 

Robust  applications  must  have  redundant  agents  to  provide  robust  control,  even  if  they  have  no  other  redundant 
components. 

•  Positive  control 

Loss of control should be prevented. Every domain should have at least one agent.

•  Control  structure 

Control is a behavior as fundamental as communication, and should be subject to the same principles of manifest
expression,  abstraction,  and  composition. 

•  Control  over  control 

Agents,  domains,  and  the  relation  between  them  are  not  static.  Changes  to  the  protection  relation,  as  well  as  process 
structure,  must  be  protected. 

•  Simplicity 

"Everything  should  be  as  simple  as  possible,  but  no  simpler."  -  A.  Einstein.  (Ironically,  restrictions  designed  to 
simplify  the  protection  system  were  at  the  root  of  many  early  problems.) 

Primitive processes are opaque in HPC. In effect, every simple space (containing a process) comprises a
domain, acting as its own agent. This interpretation is necessary if processes and (opaque) arbitrary objects are to
be treated equivalently.

Because  an  agent  may  be  a  complex  object,  it  is  natural  to  obtain  redundant  control  by  propagating  privileges 
to  more  than  one  of  its  components  (via  control  composition).  There  may  be  several  processes  distributed  across 
several sites, able to implement the agent’s policy. As long as a single process remains, the domain is under
positive  control. 

It is tempting to permit exactly one agent for each domain, using complex objects to obtain multiple agent
processes. That restriction leads to a distinction between an agent object and the component processes that are
authorized on its behalf. Propagation of this authority was a principal problem with one version of the protection
system. Another reason to allow multiple agents for a domain is dynamic replacement of one agent with another.
To maintain positive control, a new agent must be established before the old one is removed, requiring two agents at
least momentarily.

We can also show that an agent must be allowed to control multiple domains to preserve the abstraction of
robust applications. Consider an autonomous, complex object as an agent and assume that agents may control only
one domain. (This assumption was the major problem with the other earlier version of the protection system.) Call
the object O, the domain it controls E, and the domain of its internal structure I.


Figure  3.1.  Robust,  Opaque  Agent 

By assumption, the subobjects of O controlling E may not also control I. There must be a separate object to act as
agent for I. Invoking the abstraction principle, autonomous agents should be allowed for I, since an individual
process obviously must be allowed. If O is robust, its internal agent must be robust, therefore redundant, and thus
an autonomous, complex object. This leads to infinite regress. If abstraction is to be applied uniformly, and robust
control  is  to  be  possible,  some  agent  must  be  allowed  control  of  more  than  one  domain.  Specifically,  it  must  be 
allowed  control  of  itself,  as  well  as  an  external  domain.  The  interpretation  of  processes  as  their  own  agents  leads  to 
a  similar  argument:  Some  of  the  leaf  processes  of  a  complex  agent  must  be  agents  for  the  domain  it  controls,  as 
well  as  their  own  domains,  because  only  processes  can  actually  do  work. 

We investigated several rules for propagating the privileges of an agent object to its components, and
concluded that a complex agent’s control behavior should be defined using the same tools as a complex object’s
communication behavior. Propagation rules similar to programming language scope rules do not allow empowering
only selected components. The hierarchical direction of propagation leads to unnecessary technical complexity and
restrictions, and fails to preserve abstraction by distinguishing non-hierarchical structures from opaque subtrees.
Positive control could be tested only by examining an entire subtree for live processes. Similarly, the full protection
relation would not be manifest from looking at local structures.

3.1.3. Controllers

Composition of communication in HPC provides a clean implementation of a complex agent's behavior by
components, manifest relationships, dynamically and incrementally defined paths, no hierarchical
restriction, and the ability to debug at several levels of abstraction. The problems with hierarchical control
propagation strongly suggest using the same tools to compose control. Agent processes are at one end of control
paths, and controllers were introduced to provide an explicit destination for such paths.

There is no way to disconnect and reconnect an interface atomically. For an agent to replace itself with
another, the new agent must be connected to the controller before the old one is disconnected. This requires either a
multiplex interface on the controller, or a multicast interface somewhere along the path, preferably on the controller
so that direct connections can be replaced.

A special IPC mechanism named control is used as the structure of simple controller views, and controller
objects present an interface with the following partial specification.

[ role: "HPC service",
  type: "multicast agent",
  external: erepetrt,
  structure: multicast
  ( role: "HPC controller",
    type: "HPC invocations",
    external: exter.sir-,
    structure: control
  )
]

Control  streams  from  the  agents  are  merged  into  a  single  stream  into  the  controller,  and  notifications  from  the 
controller  are  replicated  for  all  the  agents.  Checking  for  positive  control  uses  the  mechanism  for  detection  of  end- 
to-end  communication  paths. 

Agent privileges propagate across domain boundaries along control paths. There is no limit on the number of
agents that may be connected to a controller, either directly or as multicast components inside a directly connected
agent. Likewise, an object can have any number of control interfaces and can be agent to several domains. Activity
on  behalf  of  different  domains  is  explicitly  distinguished  by  using  a  separate  interface  for  each  stream  of  control 
messages. 
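
The merging of control streams, the replication of notifications, and the replacement of one agent by another can be illustrated with the following Python sketch. It is a toy model under stated assumptions, not the HPC interface: the class Controller and its methods are hypothetical, messages are plain strings, and refusing to disconnect the last agent is a simplification adopted here to model positive control.

```python
class Controller:
    """Hypothetical stand-in for a controller's multicast control interface."""

    def __init__(self):
        self.agents = {}       # agent name -> notifications delivered to it
        self.invocations = []  # merged stream of (agent, operation) pairs

    def connect(self, agent):
        self.agents[agent] = []

    def disconnect(self, agent):
        # Assumption of this sketch: the last agent may not simply vanish.
        if agent in self.agents and len(self.agents) == 1:
            raise RuntimeError("domain would be left without an agent")
        del self.agents[agent]

    def invoke(self, agent, operation):
        if agent not in self.agents:
            raise PermissionError(agent + " has no control path")
        self.invocations.append((agent, operation))  # streams are merged
        self.notify(operation + " done")

    def notify(self, message):
        for inbox in self.agents.values():           # notifications are replicated
            inbox.append(message)

    def has_positive_control(self):
        return bool(self.agents)


ctl = Controller()
ctl.connect("old")
ctl.connect("new")            # the new agent is connected first
ctl.invoke("old", "inquire")
ctl.disconnect("old")         # only then is the old agent removed
```

Connecting the new agent before disconnecting the old one is what requires the multicast (or multiplex) interface: for a moment the controller has two agents.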

3.2.  Preserving  Invariants 

There are three ways protection relations can change. Agents associated with a domain can change, domain
rights can change, and domains can be created and destroyed. Agents are defined by communication paths to
controllers, and operations on connections or processes may affect the agent relation as side effects, but special
attention is required only when the last agent for a domain is removed (to ensure positive control).

The  direct  effects  of  most  HPC  operations  on  the  protection  relations  are  minimal.  Every  view  or  other 
structural  feature  is  always  associated  with  one  domain.  When  a  feature  is  created  or  destroyed,  rights  to  operate  on 
it are added to or removed from its domain. However, not all rights belong to a domain. Some types of access may
require simultaneous access to multiple features, for example disclose requires access to both sides of a shell. A
domain contains the right for a multiple access only when all the features accessed are in the domain.
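
The rule for multiple accesses can be stated in a few lines of Python. This is only a sketch: the mapping domain_of, the feature names, and the function holding_domain are hypothetical and are not part of HPC.

```python
def holding_domain(features, domain_of):
    """Return the one domain that holds a multiple-access right, or None.

    features  -- the structural features the operation must access together
    domain_of -- hypothetical mapping from each feature to its single domain
    """
    domains = {domain_of[f] for f in features}
    return domains.pop() if len(domains) == 1 else None


# Both sides of shell t lie in D1; the two sides of shell s lie in different domains.
domain_of = {
    "s.inside": "D1", "s.outside": "D2",
    "t.inside": "D1", "t.outside": "D1",
}
```

Under this model, a disclose on shell t is a right of D1, while a disclose on shell s belongs to no domain at all.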

Domain creation and destruction have more powerful effects. Only these operations affect the partitioning of
features into domains. A feature is created in a well-defined domain, and it remains there until the domain is
destroyed or a new domain is created around it. Rights can be added to a domain only by creating new features or
by destroying another domain. However, abstract process creation and destruction have the greatest
effect on all forms of structure, including domains.

There are three structural invariants that may be violated by kill/die and depose/abdicate. First, every
process must be directly encapsulated within a domain boundary. Merging a simple domain with the neighboring
domain above it violates this invariant by exposing a 'raw' process. We cannot preclude this situation by restricting
depose to complex domains, because this would violate abstraction. However, by interpreting a process as its own
controller, it will be removed automatically when its domain is destroyed. Destruction of the superior domain is
prohibited by an asymmetric constraint introduced below.

Second, every domain must have one controller. When kill is extended to arbitrary subtrees for convenience,
it is possible to remove a controller without removing its entire domain. This could be avoided by doing without
the convenient extension. Alternatively, the operation could be restricted by an additional precondition. HPC
preserves the invariant by destroying a domain as a side-effect of removing its controller. Any remaining contents
are abdicated to a neighboring domain.

The  HPC  system  distinguishes  the  root  space  and  acts  as  agent  for  a  domain  consisting  of  just  that  space.  If  a 
top-level domain merges with the root, HPC kills the subtree and removes its shell to restore the desired root
structure. 

The  naive  definition  of  kill  and  die  replaces  the  subtree  on  one  side  of  a  shell  with  a  single  empty  space.  The 
subtree that includes the root space also includes all the top-level applications. Clearly, it should not be possible to
destroy these from arbitrary places in the hierarchy. Similar considerations apply to depose. Repeated depositions
would give an agent control over all process structure.

Hierarchical  organization  makes  strong  assumptions  about  the  control  privileges  of  superior  and  inferior 
levels. Components are usually considered implementing modules chosen by, and subordinate to, their parent. A
superior level is expected to have the privileges to create and destroy inferiors, while inferiors are expected to have
no  control  over  superiors. 

A single asymmetric constraint based on the hierarchy avoids interference between top-level applications and
disruption  of  a  superior  domain  by  an  inferior.  An  agent  may  depose  or  kill  only  inferior  domains,  and  may 
abdicate  or  die  only  to  the  (unique)  superior  domain.  Given  this  restriction,  a  single  operation  suffices  for  each  of 
the  depose/abdicate  and  kill/die  pairs.  These  unified  operations  could  be  applied  by  agents  on  either  side  of  a 
domain  boundary  to  remove  the  inferior  domain  or  subtree.  (The  HPC  implementation  uses  such  operations 
internally,  but  the  pairs  of  distinct  operations  are  retained  in  the  application  interface  to  increase  the  chances  of 
detecting  errors.) 
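
The asymmetric constraint can be expressed as a simple precondition check. The Python sketch below is illustrative only: the class Domain and the function permitted are hypothetical, and the sketch assumes that "inferior" means the domain immediately below the agent's own domain in the coarse domain graph.

```python
class Domain:
    """A node of the coarse domain graph (hypothetical model)."""

    def __init__(self, name, superior=None):
        self.name = name
        self.superior = superior


def permitted(operation, agent_domain, target):
    """Check the asymmetric constraint for one of the four operations."""
    if operation in ("depose", "kill"):
        # Applied downward: the target must lie directly below the agent's domain.
        return target.superior is agent_domain
    if operation in ("abdicate", "die"):
        # Applied to the agent's own domain, which yields to its unique superior.
        return target is agent_domain and target.superior is not None
    raise ValueError("unknown operation: " + operation)


root = Domain("root")
app1 = Domain("app1", root)
app2 = Domain("app2", root)
part = Domain("part", app1)
```

With this check, app1 can kill part but cannot touch app2 or root, so top-level applications cannot interfere with one another and an inferior cannot disrupt its superior.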

3.3. Terminal Policy

An  agent  controls  a  domain  when  it  has  a  logically  complete  and  physically  implemented  communication  path 
to the domain’s controller. There are four ways to lose control of a domain.

•  An  agent  causes  the  domain  to  be  destroyed. 

The domain’s agents are forcibly taken out of control by the (possibly distant) side-effects of another agent.

• All paths from the controller terminate within the domain.

Only a connected agent can make a new connection inside the domain, and an agent can be added only by making
one. Accordingly, if all the paths to the controller are broken inside the domain, control has been permanently lost.

• None of the paths from the controller into other domains is complete.

If some paths pass into other domains, but none are complete (there are no processes at their ends), control has been
temporarily lost. Agents for other domains might complete a path and restore control.

• None of the complete paths from the controller are physically viable.

When no logically connected agent process is physically reachable due to a partition, control has been temporarily
lost. When the partition ends, control may be restored.

The HPC system normally will do nothing without an agent. This is a (trivial) null policy. It is convenient to
specify a more complex terminal policy to be applied directly by the HPC system during temporary or permanent
losses of control. Such policies should be very simple.
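
The distinction between temporary and permanent losses in the four cases listed above can be summarized by a small classification function. The Python sketch below is an illustration under assumptions: the record Path and its three flags are hypothetical abstractions of the control paths leaving a controller, and a destroyed domain is treated as a permanent loss.

```python
from dataclasses import dataclass
from enum import Enum


class Loss(Enum):
    NONE = "in control"
    TEMPORARY = "temporary"
    PERMANENT = "permanent"


@dataclass
class Path:
    """One control path leaving the controller (hypothetical model)."""
    leaves_domain: bool  # the path crosses into another domain
    complete: bool       # an agent process sits at its far end
    reachable: bool      # that process is physically reachable


def classify(domain_exists, paths):
    if not domain_exists:
        return Loss.PERMANENT  # the domain itself was destroyed
    if any(p.leaves_domain and p.complete and p.reachable for p in paths):
        return Loss.NONE       # at least one agent is in control
    if not any(p.leaves_domain for p in paths):
        return Loss.PERMANENT  # every path terminates inside the domain
    return Loss.TEMPORARY      # incomplete or partitioned paths may recover
```

The order of the tests matters: a single viable path is enough for control, and only when no path leaves the domain is recovery by an outside agent impossible.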

Domain  structure  controls  visibility  as  well  as  other  access.  Autonomous  applications  are  opaque;  their 
internal structure is completely hidden and complex applications cannot be distinguished from single processes.
This opacity is preserved by die and kill, because the internal structure is always removed before the domain is
merged into its superior. However, depose removes control from the application and reveals its internal structure.
Some applications may wish to hide their implementation under all circumstances. By specifying die as the
terminal  policy  for  permanent  losses  of  control,  an  agent  can  ensure  the  privacy  of  its  internal  structure. 

Policies  stronger  than  null,  but  less  drastic  than  die,  must  be  applied  to  avoid  subversion  of  access  control, 
consistency control, and similar user policies during a temporary loss of control. For example, orphaned
transactions should not be allowed unsupervised interactions with other processes. The suspend policy stops all
communication passing through the domain. Even communication between processes unaffected by the physical
partition is halted by forcing all interfaces across the domain boundary to the suspended state. (See Chapter 4.)
Suspension lets component objects continue internal processing, but prevents any interactions between them.

Giving  control  to  another  user  agent  is  also  a  suitable  response.  The  system  could  animate  a  new  agent 
object  and  connect  it  to  the  controller,  or  abdicate,  merging  the  domain  with  its  parent  in  the  hierarchy.  Trying  to 
create a new agent introduces complexity. When this policy is selected, the HPC system must record the desired
agent  parameters  and  a  fallback  policy,  as  it  may  be  impossible  to  create  the  necessary  agent  processes. 

A complete terminal policy is a sequence of the basic policies null, suspend, abdicate, die, and animate
(with parameters). The policy sequences for temporary and permanent losses of control are independently specified
by a domain’s agent. When a loss of control occurs, the HPC system applies each basic policy in turn until one
succeeds. (The first four policies cannot fail.) A default policy is applied if the sequence is exhausted. The defaults
are abdicate for permanent losses of control, and suspend for temporary losses.

The HPC system will not accept permanent responsibility for any domain, so null and suspend are not
permitted  in  the  terminal  policies  for  permanent  losses  of  control.  However,  any  basic  policy  is  permitted  for 
temporary  losses. 
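
The evaluation of a terminal policy sequence can be sketched as follows. The Python code is illustrative and is not the HPC implementation: the function names and the callback try_animate (which stands for the attempt to create the recorded agent and returns True on success) are assumptions of this sketch.

```python
BASIC = ("null", "suspend", "abdicate", "die", "animate")
DEFAULTS = {"permanent": "abdicate", "temporary": "suspend"}


def validate(kind, sequence):
    """Reject sequences that the rules above do not permit."""
    for policy in sequence:
        if policy not in BASIC:
            raise ValueError("unknown basic policy: " + policy)
        if kind == "permanent" and policy in ("null", "suspend"):
            raise ValueError(policy + " is not permitted for permanent losses")


def apply_terminal_policy(kind, sequence, try_animate):
    """Return the basic policy that takes effect for this loss of control."""
    validate(kind, sequence)
    for policy in sequence:
        # Only animate can fail; the other four policies always succeed.
        if policy != "animate" or try_animate():
            return policy
    return DEFAULTS[kind]  # the sequence was exhausted
```

For example, the sequence animate, die for permanent losses asks the system to try to create a replacement agent and, failing that, to remove the domain's internal structure rather than reveal it.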

3.4.  Policy  Filters 

Every control message requesting a legal operation that arrives at a controller is acted on. The HPC
protection mechanism does not distinguish among agents, or among the legal operations, and therefore provides
nothing like "read only" or "restricted" access to a domain. However, arbitrarily complex and sophisticated access
control policies can be implemented without extending the basic mechanism.

Because command invocations are explicitly modelled as messages in a stream, they can be monitored and
filtered. A trusted policy filter agent can be interposed between the controller and a restricted agent with limited
privileges. (Figure 3.2.) It forwards authorized control messages from the partially trusted agent to the controller,
and rejects unauthorized invocations. Notifications from the controller are forwarded in the opposite direction.

Figure 3.2. Policy Filter

A policy filter presents the same abstract control interface as a controller, although it does not have the same
intrinsic robustness and availability properties. Controllers are placeholders for the HPC system and can be handled
specially, while policy filters are ordinary objects and can fail independently of the HPC system. A reliable filter
must be a complex object with internal replication, multiplexed delivery of incoming messages, and coordination
between components to avoid multiple deliveries of the same operation to the controller.

Read-only access is easily provided by authorizing only inquire operations. More general forms of access
control require checking the arguments of an invocation against a list of accessible structure. For example, it might
be desirable to restrict an agent to managing the connections among a specific group of interfaces. The policy filter
to enforce this restriction would forward those invocations whose arguments are on the specified list, and reject any
others. The controller will reject illegal operations, so the policy filter doesn’t need to make other explicit checks.
Only notifications concerning the specified interfaces would be passed back to the restricted agent.
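
Both kinds of filter described above can be sketched in a few lines of Python. The sketch is hypothetical throughout: the class PolicyFilter, the operation name connect, and the representation of the controller as a callable are assumptions made for illustration, and only inquire is an operation named in the text.

```python
class PolicyFilter:
    """Sits between a restricted agent and the controller (illustrative only)."""

    def __init__(self, forward, allowed_ops=None, allowed_interfaces=None):
        self.forward = forward                        # delivers to the controller
        self.allowed_ops = allowed_ops                # None means any operation
        self.allowed_interfaces = allowed_interfaces  # None means any interface

    def invoke(self, op, *args):
        if self.allowed_ops is not None and op not in self.allowed_ops:
            return "rejected"
        if self.allowed_interfaces is not None:
            if not all(a in self.allowed_interfaces for a in args):
                return "rejected"
        # Legality is left to the controller; the filter checks policy only.
        return self.forward(op, *args)

    def notification(self, interfaces, text):
        """Pass a notification back only if it concerns accessible interfaces."""
        if self.allowed_interfaces is None:
            return text
        return text if set(interfaces) <= self.allowed_interfaces else None


log = []


def controller(op, *args):
    """Stand-in for the controller: records the invocation and accepts it."""
    log.append((op, args))
    return "ok"


read_only = PolicyFilter(controller, allowed_ops={"inquire"})
group = PolicyFilter(controller, allowed_interfaces={"a", "b"})
```

The filter presents the same call interface in both directions, so the restricted agent cannot tell from the interface alone whether it is connected to a controller or to a filter.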

Policy filters need not tell the truth. For example, inquire can provide information about the parent and
children of an interface, some of which might be inaccessible. A stricter version of the filter discussed above would
tell the restricted agent only what it needs to know by modifying notifications to delete references to inaccessible
interfaces.

A policy filter can erase all references to itself from notifications, report all connections to itself as
connections directly to the controller, and translate all operations involving the controller into operations on itself.
This technique provides a transparent illusion of unfiltered control, and permits user-level extensions to the HPC
system. Most generally, a (hidden) policy filter can translate abstractions of new structures and operations into
concrete HPC features the same way the HPC system translates abstract hierarchies into concrete host processes and
media. The new abstractions need not have any strong relation to HPC at all.

A journal of invocations and their sources is one useful extension that requires no change to the HPC
interface. A journal would allow transparent debugging and enforcement of audit trails. An extension that involves
only a small change in the system interface would augment the host identifiers given in animate (this host and
specific host X) with Medusa-style location specifiers like same host as process P, not far from process P, and near
process P but different host [Ous81]. The hidden policy filter would evaluate the arguments of extended animate
calls, perhaps with the help of a resource management utility, and then invoke the basic HPC animate operation
with specific hosts. All other control messages would be forwarded without change.

Hidden policy filters are the natural interface between the HPC system and important administrative functions
and access control features outside the system, such as native protection systems, charging, resource quotas, and
classes of service. While the HPC system has no concept of authentication or user identity, and will simply report
"process creation failed", an extended system could add additional properties such as (identity), new operations such
as (login as user X), and extended reporting ("failed due to insufficient funds/privileges/resources").
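
The translation performed by a hidden policy filter for an extended animate can be sketched as follows. This Python fragment is an assumption-laden illustration, not the HPC interface: the tuple form of a location specifier, the tables host_of and neighbours (standing in for a resource management utility), and the callable basic_animate are all hypothetical.

```python
def resolve_host(spec, host_of, neighbours):
    """Translate an extended location specifier into one specific host."""
    kind, arg = spec
    if kind == "specific host":
        return arg                 # already a basic HPC host identifier
    home = host_of[arg]            # host currently running process arg
    if kind == "same host as":
        return home
    if kind == "near but different host":
        for host in neighbours[home]:
            if host != home:
                return host
        raise LookupError("no other host near " + home)
    raise ValueError("unknown location specifier: " + kind)


def extended_animate(program, spec, host_of, neighbours, basic_animate):
    """Evaluate the extended argument, then invoke the basic operation."""
    return basic_animate(program, resolve_host(spec, host_of, neighbours))


host_of = {"P": "h1"}
neighbours = {"h1": ["h1", "h2", "h3"]}
```

Only the animate argument is rewritten; every other control message would pass through such a filter unchanged.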

The flexibility and power of policy filters is available only when the basic interface to system facilities can be
intercepted and transparently replaced. Most systems do not make this possible. The Accent operating system is one
exception. An Accent kernel port is a system interface similar to the HPC controller, but interception must take
place when the partially trusted process is created, or depend on that process to cooperate when reconfiguring.

3.5.  Classical  Protection  Issues 

Security policies are usually divided into access control policies and information control policies, and there
are  classical  questions  concerning  the  ability  of  a  protection  model  to  express  and  enforce  them. 

Because process state and the content of interprocess communication are outside the HPC system, HPC
cannot address the basic information control issues, which are modification and spoofing (how change to data is
controlled), and retention and confinement (how propagation of data is controlled).

The primary access control issues are propagation and conservation (how privileges are transferred),
revocation (how privileges are removed), and mutual suspicion and amplification (how two agents can grant each
other only selective access). These problems have been examined most closely in protection systems that allow
transfer of rights between domains (e.g., capability systems). The HPC domain system has quite different
properties,  because  the  rights  in  a  domain  are  fixed,  but  access  can  be  propagated  and  filtered  without  affecting  the 
domains.  The  rights  relation  is  modified  in  capability  systems,  while  the  agent  relation  is  modified  in  HPC. 

Amplification of rights is an important issue in capability systems where the owner of an object instance may
not operate on it, while the manager for the object type does not have any rights for the instance. When the owner
passes the object to its manager as a parameter, the manager is temporarily granted full rights to the object.
Amplification has no analog in HPC because objects are never passed as parameters, rights are never transferred
between  domains,  and  there  are  no  user-defined  types. 

HPC  privileges  at  any  moment  are  determined  by  the  composition  of  agents  and  controllers.  Connections  and 
policy filters provide explicit propagation, mutual suspicion through filtering, and immediate revocation of access by
disconnecting an agent. However, there is no way to restrict a partially trusted agent from propagating its current
access further. This is consistent with the basic structuring principles. A strong conservation facility would violate
abstraction by preventing a complex agent from implementing its control services any way it chooses.

4. Communication Structure

Connections and interfaces are the HPC structural features that tie processes and objects together into useful
applications. Explicit communication is the only form of direct interaction between processes in HPC. Interfaces
provide abstract destinations so that a process does not need to know the location or identity of its partners in
communication, and the actual destinations are determined by chains of connections joined end-to-end at interfaces.

Because  the  possible  patterns  of  interaction  are  expressed  and  limited  by  the  available  communication 
structures, HPC extends the intuitive one-to-one channels (simple interfaces) to multiple parallel channels (bundle
and multiplex interfaces), and to many-to-many communication patterns (multicast interfaces). To illustrate the rich
patterns that HPC can describe, we begin this Chapter with an HPC specification for a replicated remote procedure call
system  taken  from  the  literature. 

Section 4.2 rigorously defines the significant communication paths, replacing the intuition of sequences of
connections  and  incorporating  the  effects  of  complex  interfaces.  Section  4.3  presents  the  view  properties  intended 
for  use  by  agents  to  manage  structure  in  their  domains. 

There  are  several  notable  interactions  between  apparently  independent  features  of  communication  structure, 
abstraction,  and  IPC  implementations.  For  example,  the  information  hiding  provided  by  shell  abstractions  can  be 
partially defeated by interface properties, and the hierarchical interpretation of nested objects is poor for labelling
the direction of communication flow. Multicasting offers yet another set of problems. These issues are discussed in
Section  4.4. 

Section 4.5 concludes with a dissection of communication into three functions that are quite distinct in HPC,
and  a  comparison  with  the  ISO  reference  model. 

4.1.  Example:  Circus  Replicated  RPC 

The  Circus  system  extends  the  Courier  RPC  mechanism  to  groups  of  replicated  processes  [Coo84].  Circus 
will  serve  as  a  good  demonstration  of  the  expressive  power  of  communication  structure  and  hierarchical  process 
composition  in  HPC. 

In Circus terminology, a replicated group of processes is called a troupe. All members of a troupe are
functionally equivalent. They may run at different speeds, and have different internal states, but must execute the
same sequence of RPC operations. An RPC call from any member of a client troupe is replicated to every member
of the server troupe. Similarly, a reply is replicated to all of the callers. Code in the Circus RPC library counts the
number of requests or replies and applies various redundancy policies, like majority voting or quorum consensus, to
decide whether a valid replicated call has taken place. A special troupe, called the ringmaster, monitors the number of
members in each troupe, which may vary dynamically, and makes this information available to the RPC library so it
can check for the required degree of redundancy.
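
As an illustration of the kind of redundancy policy mentioned above, the following Python sketch shows a strict-majority vote over the replies collected from a server troupe. It is not the Circus library code; the function name collate_majority and the parameter troupe_size (standing in for the membership count supplied by the ringmaster) are assumptions made for this example.

```python
from collections import Counter


def collate_majority(replies, troupe_size):
    """Return the reply agreed on by a strict majority of the troupe, or None.

    replies: hashable reply values received so far (hypothetical representation).
    troupe_size: number of troupe members currently reported by the ringmaster.
    """
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    # A strict majority means more than half of the whole troupe, not merely
    # more than half of the replies that happened to arrive.
    return value if count > troupe_size // 2 else None


print(collate_majority(["ok", "ok", "fail"], troupe_size=3))
print(collate_majority(["ok", "fail"], troupe_size=4))
```

A quorum-consensus policy would differ only in the acceptance test, which is why the text treats these policies as interchangeable library choices.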

The Circus communication pattern can be expressed using a combination of multiplex and multicast
interfaces, and troupes can be composed using a single connection. Consider Figure 4.1, which shows a single
client connected to a multiplex server. Partial specifications for the client and server interfaces are:

[ role: "function stub",
  structure: simple Courier client
]

[ structure: multiplex
  [ role: "function entry",
    structure: simple Courier server
  ]
]

Figure  4.1.  Simple  Client  and  Server 

Circus troupes are modeled as HPC objects; their member processes are simple objects within the troupe.
Replication of RPC calls requires one minor change to the specification of individual troupe members. The
library expects sequences of calls rather than single calls, so we will use the type replicated-Courier rather than
Courier. The relevant specification of client troupe member interfaces is thus:

[ role: "function stub",
  structure: simple replicated-Courier client
]


For  server  troupe  members  it  is: 

[ structure: multiplex
  [ role: "function entry",
    structure: simple replicated-Courier server
  ]
]


Each  troupe  replicates  communication  for  each  of  its  members,  so  the  interfaces  for  each  troupe  should  be 
multicast.  The  interface  for  the  overall  client  troupe  is  specified  as: 


[ role: "troupe stub",
  structure: multicast
  [ role: "function stub",
    structure: simple replicated-Courier client
  ]
]

Each server component must be compatible with this client structure, and the server must multiplex its service to
many clients, so the specification for the entire server troupe interface is a multiplexed, multicast RPC interface.


[ structure: multiplex
  [ role: "troupe entry",
    structure: multicast
    [ role: "function entry",
      structure: simple replicated-Courier server
    ]
  ]
]


When a new client is added to the server, components are created on both sides of the server’s main multiplex
interface. (These components are multicast views.) The manager responsible for the internal structure of the server
troupe creates one component of the internal multicast view for each individual server process. (These components
are replicated RPC views.) It then takes each of the individual server processes in turn and creates a new replicated
RPC component for its main multiplex interface. Connections are then created between the external RPC views of
the server processes and the internal RPC views of the troupe.

A server troupe with two members and a client troupe with three members are illustrated in Figure 4.2. The
troupes in this figure have the same relationship as the individual objects in Figure 4.1. In both cases, a multiplexed
server with components for three clients is shown servicing one. The only difference is that the client troupe
interface is a multicast RPC interface, instead of a simple RPC interface, and the server is, of course, compatible.
The individual client and server processes have the same interfaces as before, except for the expected replication of
calls.


Figure  4.2.  Circus  Replicated  Client  and  Server 

Using HPC multiplex and multicast views, Circus troupes can be created using almost unmodified
conventional clients and servers. Any degree of replication within a troupe is supported, and greater flexibility is
possible than in the original Circus structure, because server troupes can choose to assign differing sets of server
processes to service different clients, by controlling the server processes connected to a given internal multicast
view.

Since the ringmaster is just a troupe, albeit a special one, the same communication structures could be used to
communicate between troupe members and the ringmaster. A typical client/server relationship (especially for
global services like the ringmaster) would be implemented using these multiplex and multicast interfaces on splices
across the hierarchy, rather than through connections between clients and the server.

4.2.  Significant  Communication  Paths 

Earlier presentation of communication paths leaned heavily on intuition to simplify discussion. At the
expense of some additional terminology, we present here the remaining details. The most important detail is the
recursive definition of end-to-end chains, which are the only communication paths where action at one end can be
reflected at the other. The concepts of endpoint/extension and of corresponding components are needed for this
crucial definition.

4.2.1. Endpoints and Extensions

The views "at the ends" of a communication path, and those "in the middle" are distinct. The send operation
can be sensibly applied to the former, while connect makes sense for the latter. We call them endpoint and
extension views, respectively. Every view is created as one or the other, and it retains that property throughout its
lifetime. The fixed distinction between endpoints and extensions improves abstraction, simplifies the system
implementation, and eliminates a class of inconsistencies due to network partition.

Extensions  are  related  to  endpoints  somewhat  the  same  way  opaque  abstract  data  types  are  related  to  their 
concrete  representations.  Primitive  behaviors  are  implemented  at  endpoints  in  terms  of  message  contents,  but  the 
behaviors  are  composed  and  combined  at  extensions  without  access  to  the  internal  contents  of  the  communication. 
Specifically,  operations  like  send  and  receive,  and  the  HPC  primitives  new  and  delete,  can  be  invoked  only  on 
endpoints,  while  connect  and  disconnect  can  be  invoked  only  on  extensions. 
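
The division of operations between the two kinds of view can be sketched in a few lines of Python. This is an illustration only, not the HPC prototype: the class names View and Kind, and the outbox and peer attributes, are hypothetical. The sketch fixes the kind of a view at creation and rejects send on extensions and connect on endpoints.

```python
from enum import Enum


class Kind(Enum):
    ENDPOINT = "endpoint"
    EXTENSION = "extension"


class View:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind      # fixed for the lifetime of the view
        self.peer = None      # the view this extension is connected to, if any
        self.outbox = []      # messages sent from this endpoint

    def send(self, message):
        if self.kind is not Kind.ENDPOINT:
            raise TypeError(f"send is only valid on endpoints: {self.name}")
        self.outbox.append(message)

    def connect(self, other):
        if self.kind is not Kind.EXTENSION or other.kind is not Kind.EXTENSION:
            raise TypeError("connect is only valid between extensions")
        if self.peer is not None or other.peer is not None:
            raise ValueError("a view may take part in only one connection")
        self.peer, other.peer = other, self


a = View("A", Kind.ENDPOINT)
b = View("B", Kind.EXTENSION)
c = View("C", Kind.EXTENSION)
b.connect(c)      # allowed: both views are extensions
a.send("hello")   # allowed: A is an endpoint
```

Calling b.send(...) or a.connect(...) in this sketch raises TypeError, which mirrors the rule that each operation is meaningful on only one kind of view.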

Additionally, while both endpoints and extensions may be complex (with component structures), component
views are elaborated only at endpoints. The hidden component structure of an endpoint is said to be masked.
Consider the Circus server interface given earlier in more detail (Figure 4.2).

[ internal: endpoint,
  external: endpoint,
  structure: multiplex
  [ internal: endpoint,
    external: extension,
    structure: multicast
    [ internal: extension,
      external: masked,
      structure: simple replicated-Courier server
    ]
  ]
]

There are three levels to this interface tree structure. At the top level, the multiplex views on both sides of the shell
are endpoints, and their components are visible. Each of these components is a multicast view, but the external
views are extensions while the internal ones are endpoints. On the outside, this is the lowest accessible level of the
view hierarchy because the third level structure is masked, while inside, the third layer is available as simple
extensions of the multicast views.

Besides the abstraction benefits, allowing messages or procedure calls only at views with certain fixed
properties eliminates the need to implement media for all communication paths. Only a subset of paths terminate in
two simple endpoints, and only this subset requires the manipulation of physical media. In fact, we go further and
implement media only for paths where both endpoints are internal views of real processes, and when both processes
have expressed an interest in actually using their endpoints. This allows the HPC system to prepare the transport
media used by a given process at convenient times.

It also means that a complex object can not send a message directly, enforcing passive hierarchies. (An agent
could perfectly well send on any interface in its domain without this restriction.) All communication is performed
by simple processes, and the connections inside a complex object determine which subobjects, and ultimately
processes, communicate on its behalf.

A chain would implicitly multicast to all points along it without the restriction of delivery to endpoints. The
situation for sending is analogous. Since most paths are one-to-one, the distinction between endpoints and
extensions avoids unwanted generality (and implementation complexity), requiring explicit introduction of
multicasting when it is desired.

Finally, a class of possible inconsistency due to network partition is eliminated, and the reconciliation rules
are simplified accordingly. Because the endpoint/extension property is fixed for a given view, there will never be a
conflict between its use with abstract connections (extensions) or with transport media (endpoints). All
inconsistencies in paths can be reduced to a single type of illegal structure: multiple connections to a single view.

4.2.2.  Implicit  New  and  Corresponding  Components 

Paths involving complex interfaces represent several component paths, and it is important to keep track of
which component views at one end are associated with which component views on the other. Bundles have a fixed
number of components, and they are associated in the obvious way. Multicast interfaces have dynamically created
components, but they all represent the same communication channel. However, each multiplex component
represents a distinct channel and it is necessary to identify the other end of the channel.

It would be pointless for the new primitive to create a unique, one-ended channel. There would never be any
way to communicate. Instead, under certain circumstances new creates views for both ends of the new channel.
When two multiplex endpoints are joined by a complete path, a new component is created for both endpoints.
(Because of multicasting, components may be created for more than one remote multiplex endpoint.) This is one of
the two cases where an HPC primitive can have a significant non-local effect. (The other is splicing to a
promiscuous service.) The delete primitive affects only its argument.

We can now define the useful notion of corresponding components. Correspondence for bundle views is
determined by the fixed order of the component structures; the nth component of one bundle corresponds to the nth
component of another. Given two compatible multicast views, all components of one correspond to all components
of the other. Correspondence for multiplex views depends on the order in which the component views were created;
a multiplex component corresponds to just those components that were created by the same invocation of new.
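
The three correspondence rules can be summarized in a short Python sketch. The representation is an assumption made for illustration: each component is a (name, creation_id) pair, where creation_id is a hypothetical token identifying the invocation of new that created the component. It is consulted only for multiplex views.

```python
def corresponding(structure, mine, theirs):
    """Map each local component name to the remote names it corresponds to.

    mine, theirs: lists of (name, creation_id) pairs for two compatible views.
    """
    if structure == "bundle":
        # Fixed order: the nth component corresponds to the nth component.
        return {m[0]: [t[0]] for m, t in zip(mine, theirs)}
    if structure == "multicast":
        # Every component corresponds to every component of the other view.
        return {m[0]: [t[0] for t in theirs] for m in mine}
    if structure == "multiplex":
        # Only components created by the same invocation of new correspond.
        return {m[0]: [t[0] for t in theirs if t[1] == m[1]] for m in mine}
    raise ValueError(f"not a complex structure: {structure}")


print(corresponding("bundle", [("i", None), ("k", None)], [("j", None), ("l", None)]))
print(corresponding("multiplex", [("c1", 1), ("c2", 2)], [("s1", 1), ("s2", 2)]))
```

A multiplex component may map to more than one remote component when multicasting causes a single new to create components at several remote endpoints.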

4.2.3. Peers and Chains

Views that have been bound together as a link in a communication path are called peers. There are links
through connections and links through shell boundaries. Views bound by a connection are called public peers, since
the binding is manifest whenever the views are visible. Views bound through shell boundaries are called private
peers, because the binding is not always known, even to the owners of the views, due to protection and visibility
boundaries. A communication path, or chain, is a sequence of alternating private and public peers.

End-to-end chains, which terminate at endpoints, are the most important. When their structure is simple, they
must be implemented with transport media. When their structure is multiplex, they allow implicit creation of
components. We often call the terminal pair of endpoints end-to-end peers. In Figure 4.3, A and D are endpoints,
and B and C are connected extensions. <A, B> and <C, D> are private peers, while <B, C> are public peers. The
only end-to-end peers are <A, D>, which are neither private nor public peers.

Figure 4.3. Simple Chain


Public peers are defined only by direct connections, but private peers are defined by a recurrence involving
interfaces, chains, and corresponding components. In the base case, the two halves of an interface or complete
splice are private peers. In the recursion step, corresponding components of end-to-end peers are private peers.
This is another significant feature of end-to-end chains: private peers can cross an arbitrary number of shell
boundaries. Figure 4.4 illustrates how these rules interact. Interfaces <A, B>, <C, D>, <E, F>, and <G, H> have
relevant structure


[ internal: endpoint,
  external: extension,
  structure: foo
]

Interfaces <M, N> and <O, P> have structure

[ internal: extension,
  external: endpoint,
  structure: bundle
  [ internal: masked,
    external: extension,
    structure: foo
  ]
  [ internal: masked,
    external: extension,
    structure: foo
  ]
]

By the base case, each interface defines a set of private peers. The public (connected) peers are <B, I>, <D, K>,
<J, E>, <L, G>, and <N, O>. By definition, <M, N, O, P> is an end-to-end chain, and therefore <I, J> and <K, L>
are private peers. From this, we obtain <A, B, I, J, E, F> and <C, D, K, L, G, H> as additional end-to-end chains. If
foo is a simple structure, the last two chains are the only ones that might require implementation.
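
The recurrence can be traced mechanically. The Python sketch below is one possible reading of the rules, not the HPC implementation: it assumes one-to-one, acyclic paths (no multicasting), represents the configuration of Figure 4.4 with hypothetical tables, and takes M with components I and K, and P with components J and L, as the bundle endpoints. It repeatedly walks alternating private and public links from each endpoint and, whenever an end-to-end chain is found, makes the corresponding bundle components of its two ends private peers.

```python
def symmetric(pairs):
    """Build a two-way lookup table from a list of peer pairs."""
    table = {}
    for a, b in pairs:
        table[a], table[b] = b, a
    return table


def walk(start, private, public):
    """Follow alternating private and public links, starting with a private one."""
    chain, view = [start], start
    while view in private:
        view = private[view]      # cross a shell boundary (private peers)
        chain.append(view)
        if view not in public:
            break                 # nothing is connected here, so the chain stops
        view = public[view]       # cross a connection (public peers)
        chain.append(view)
    return chain


def end_to_end_chains(endpoints, interfaces, connections, components):
    private = symmetric(interfaces)   # base case: the two halves of an interface
    public = symmetric(connections)
    found = set()
    changed = True
    while changed:
        changed = False
        for start in sorted(endpoints):
            chain = tuple(walk(start, private, public))
            end = chain[-1]
            if len(chain) < 2 or end not in endpoints:
                continue
            if chain in found or chain[::-1] in found:
                continue
            found.add(chain)
            changed = True
            # Recursion step: same-position bundle components of the two
            # end-to-end peers become private peers.
            for a, b in zip(components.get(start, ()), components.get(end, ())):
                private[a], private[b] = b, a
    return sorted(found)


chains = end_to_end_chains(
    endpoints={"A", "C", "F", "H", "M", "P"},
    interfaces=[("A", "B"), ("C", "D"), ("E", "F"), ("G", "H"), ("M", "N"), ("O", "P")],
    connections=[("B", "I"), ("D", "K"), ("J", "E"), ("L", "G"), ("N", "O")],
    components={"M": ("I", "K"), "P": ("J", "L")},
)
for chain in chains:
    print(" ".join(chain))
```

Under these assumptions the sketch reports three chains: A B I J E F, C D K L G H, and M N O P. The first two appear only on the second pass, after the chain through M and P has made the bundle components private peers.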


Figure 4.4. Complex Chain

4.2.4. Multiple End-to-End Peers and Chains


The semantics of one-to-one communication patterns are straightforward. An operation like send, for simple
views, or new, for complex ones, invoked at an endpoint of an end-to-end chain has the appropriate effect at the
other end. However, HPC multicasting allows both multiple end-to-end peers (Figure 4.5) and multiple end-to-end
chains between a single pair of peers (Figure 4.6).


Figure 4.5. Multiple End-to-End Peers


Figure 4.6. Multiple End-to-End Chains

This raises a very important question. Is communication associated with peers or chains? In HPC, chains
define connectivity, not implementation, so one copy of a message should be delivered to each end-to-end peer,
regardless of the number of paths to a peer (and subject to the semantics of the communication medium). Similarly
for multiplex end-to-end chains, only one new component is created for each remote peer no matter how redundant
the paths are.

The  semantics  of  multiple  peers  and  chains  can  be  summarized  as: 

(1) Redundant chains between end-to-end peers are equivalent to a single chain.

(2) An operation at an endpoint is reflected, in a mechanism-dependent way, at each of its end-to-end peers.

(3) Operations at multiple end-to-end peers are reflected as multiple operations from a single peer.
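
Rules (1) and (2) amount to delivering per peer rather than per chain. A minimal Python sketch, assuming each chain is given as a sequence of view names beginning at the sending endpoint (the function name deliver and the view names are hypothetical):

```python
def deliver(message, chains):
    """Return the copies of message received by each end-to-end peer.

    chains: sequences of view names, each starting at the sending endpoint
    and terminating at an end-to-end peer.
    """
    # Rule (1): redundant chains to the same peer collapse into one.
    peers = {chain[-1] for chain in chains}
    # Rule (2): the operation is reflected once at each end-to-end peer.
    return {peer: [message] for peer in sorted(peers)}


received = deliver("m1", [
    ("A", "B", "C", "D"),
    ("A", "B", "X", "D"),   # a second, redundant chain to D
    ("A", "B", "Y", "E"),
])
print(received)
```

Here D receives a single copy of the message even though two chains reach it, and E receives its own copy.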

4.3. Management Properties

Views have some additional properties that are maintained by the HPC system for the use of agents. These
are static role and type labels, useful for identification, and a dynamic indication of reachability, useful for
triggering application-level flow control and authentication. In general, however, an independent property service
should make these properties available to clients, not the HPC system.

4.3.1. Role and Type

In addition to its structure (simple, bundle, etc.), each view has a fixed role and type. Roles and types are
arbitrary, uninterpreted strings of characters. Type describes the data representation or application-level protocol
expected at an interface. Typical types would be bytestream, rAnc-jpoa-.e, and self-describing-datagram. For complex
interfaces, type generally describes the constraints or interactions between the component streams.

Role defines the abstract behavior presented or expected at an interface. The UNIX conventions of stdin,
stdout, and stderr are well-known roles. The Accent kernel port is another standard role for a communication
interface.

An increasingly common configuration is:

[ role: "terminal",
  type: "X window client",
  structure: TCP/IP in-out
]

The complete description of the conventional UNIX interface to a process is:

[ role: "UNIX stdio",
  type: "stdio bytestream",
  internal: endpoint,
  external: endpoint,
  structure: bundle
  [ role: "stdin",
    type: "bytestream",
    internal: endpoint,
    external: extension,
    structure: simple UNIX-stream in
  ]
  [ role: "stdout",
    type: "bytestream",
    internal: endpoint,
    external: extension,
    structure: simple UNIX-stream out
  ]
  [ role: "stderr",
    type: "bytestream",
    internal: endpoint,
    external: extension,
    structure: simple UNIX-stream out
  ]
]

Role and type properties provide a general mechanism for user-defined semantic interpretation of interfaces.
Each agent is free to interpret labels as it sees fit and is not required to understand any particular label. Simple
agents will check labels for conventional compatibility (e.g., role stdin connected to stdout). Sophisticated agents
might interpose protocol or representation translation objects between views that are not immediately compatible
(e.g., type vAx-f iatt-«^re«r, connected through a conversion process to

A software development system that supports strong typing and separate compilation could generate distinct
roles for each interface and use only agents that validate roles against a database before establishing connections
between objects. Another good use for roles and separate interfaces would be to distinguish between the various
entry points of an object accepting remote procedure calls or Ada-style rendezvous. A good development system
can exploit the type information to make available at runtime a detailed description of the language and runtime
dependent message types, remote procedure call arguments and return types, and so forth. This can be used by
agents to help ensure the sensible interconnection of objects. Because interface labels specify the proper
interpretation of data transmitted through an interface, they are valuable in monitoring and debugging. Type
information can make the difference between low-level packet traces and raw dumps on the one hand, and symbolic
debugging of communicated data in a format relevant to the application on the other.

4.3.2.  Liveness 

Usually an agent can not control or even trace a complete communication path, but even the most rudimentary
control techniques require some indication that two processes are in contact with one another and remain so. A
dynamic view property called liveness provides this indication based on viable endpoints, which are simple
endpoints backed up by transport media and complex endpoints in complex domains.

While a view’s structure is fixed, its liveness changes to reflect the communication paths that pass through the
view. A viable endpoint is reachable through any alive view. When a viable endpoint is alive, a useful end-to-end
chain exists. Flow control and (re)authentication procedures can be triggered by changes in liveness.

Liveness is computed by examining the chains that start with a view and continue with its private peers. The
view is alive if at least one of these chains terminates in a viable endpoint; it is dead if none of these chains
terminates in a viable endpoint; and it is suspended if neither of these two liveness values can be confirmed due to a
network partition.
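
The three liveness values follow directly from this definition. In the Python sketch below, the outcome of examining each chain is encoded as True (terminates in a viable endpoint), False (does not), or None (unknown because of a partition). This encoding and the function name liveness are assumptions made for illustration, not part of HPC.

```python
def liveness(chain_outcomes):
    """Classify a view from the outcomes of the chains that start at it.

    chain_outcomes: one entry per chain; True if the chain terminates in a
    viable endpoint, False if it does not, None if a network partition
    prevents the outcome from being confirmed.
    """
    if any(outcome is True for outcome in chain_outcomes):
        return "alive"        # at least one confirmed viable endpoint
    if all(outcome is False for outcome in chain_outcomes):
        return "dead"         # every chain is confirmed not to reach one
    return "suspended"        # neither value can be confirmed


print(liveness([False, True, None]))
print(liveness([False, False]))
print(liveness([False, None]))
```

A single confirmed chain is enough to report alive even when other chains are cut off, whereas dead requires every chain to be confirmed.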

4.3.3. General Properties

There is no reason arbitrary properties couldn’t be associated with interfaces, letting convention determine the
significance of role and type. For example, the X window system incorporates a general facility for associating
properties with windows. However, many things can have properties, not just HPC views or X windows. Property
registration is a problem that should be solved once, not repeatedly, and that suggests a general context or property
service, independent of the HPC agent or other services.

Role, type, and more general properties can be registered by clients without involving the HPC system. This
applies to all static and many dynamic properties, most of which are uninterpreted by the HPC system. However,
liveness is an example of a property with a non-trivial definition, interpreted by HPC, and with abstraction and
protection boundaries that deny any single client access to all the data needed to compute it. Such properties should
be computed by the HPC system, but made available to clients through the property service.

Expedient compromises develop in the absence of a property service. On one hand, a system that depends on
a nonexistent service is not very useful, therefore the X developers provided the service themselves. On the other
hand, a general property service is peripheral to the issues this dissertation is intended to address. Only some
properties are needed to tell otherwise anonymous views apart. So just two specific properties were built into HPC.

Properties are useful for objects as well as views. Consider UNIX filter processes, all with the same
conventional interfaces and radically different behaviors. Role or a similar property would identify the function of a
given filter. For simple objects (processes), several properties would be useful, including the physical location of
the process, the image file from which it was animated, and its initial arguments.

The lack of object properties is especially acute during maintenance, as opposed to construction. As it stands,
all processes are indistinguishable after animation. A post mortem examination of a failed process shows only an
empty object. There is no record of the image that ran in the object, or what its parameters were, making it difficult
to know how to repair the failure. We made no provision for recording properties of objects. This is arguably a
significant oversight, but the correct solution is an independent property service not specific to HPC.

4.4. Discussion

Composition in HPC is expressed by its communication structures. The features of connections and interfaces
have some unexpected interactions with each other, with the abstraction expressed by nested objects, with the
protection system, with partitioning due to distribution, and with the semantics of real IPC implementations. We
highlight the most important such interactions here.

4.4.1.  Orientation 

Until now, we have not been careful about specifying the orientation of simple structures. Surprisingly, there
is no consistent way to label an object’s interfaces such that both the description of flow relative to the hierarchical
object, and the description of flow relative to the views, are intuitive and independent of context. This is a subtle
issue that may lead to complex or error-prone programming. There are three related issues to clarify: the effect of
the hierarchy on orientations, the flow direction specified by an orientation, and the side of an interface affected by
an orientation.

If an orientation label is applied relative to an object, the labelling of Figure 4.7 results. These labels are
appropriate in the hierarchical interpretation of process structure, and initially appear to be the right ones to use.
However, Figure 4.7 shows that in is sometimes compatible with in and other times with out. Views can not be
checked for complementary orientations without considering their relative positions in the hierarchy in addition to
their labels. Worse, there is no local indication of the direction of communication flow. (Communication between
an in and an out may flow in either direction, depending upon context.)


Figure  4.7.  Orientation  Relative  to  Objects 

Automatic reconciliation of inconsistencies in the hierarchy due to network partitions sometimes requires
taking a shell’s parent and making it a sibling. This operation (Section 6.2.3) would be unnecessarily complicated
by the need to reorient its interfaces to preserve complementary pairs of views.

If we apply orientation labels based only on the flow of communication and independent of the hierarchy, we
have two more labellings as shown in Figures 4.8 and 4.9. The first labelling is appropriate for describing the flow
relative to the external views of an object, while the second is appropriate for describing the flow from the internal
views.


Figure  4.9.  Orientation  Relative  to  Internal  Views 

We select the labelling of Figure 4.8 because it provides the expected labels for the (external) abstractions
presented by objects. The views of an interface (indeed, any pair of peers) have complementary orientations,
therefore only the orientation of the external side of an interface must be given explicitly in an interface structure.
Confusion over the orientations of internal views is possible, but there is a consistent rule governing communication
flow. Messages flow out of a view where receives are performed, and into views where sends are performed. For
RPC mechanisms where the orientations are client and server, a client process makes calls on a server view, while
the server process accepts and replies on a client view.
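
Because peers always carry complementary orientations, the internal orientation of an interface can be derived from the external one, and a proposed binding can be checked locally. The Python sketch below is illustrative only. The table COMPLEMENT and both function names are hypothetical, and treating a neutral in-out orientation as its own complement is an assumption of this example.

```python
COMPLEMENT = {
    "in": "out",
    "out": "in",
    "client": "server",
    "server": "client",
    "in-out": "in-out",   # assumption: a neutral orientation complements itself
}


def internal_orientation(external):
    """Derive the orientation of the internal view from the external one."""
    return COMPLEMENT[external]


def complementary(a, b):
    """Peers may be bound only if their orientations are complementary."""
    return COMPLEMENT.get(a) == b


print(internal_orientation("in"))
print(complementary("client", "server"))
print(complementary("in", "in"))
```

This is why an interface structure needs to state only the external orientation: the internal side follows by a table lookup that does not depend on position in the hierarchy.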

4.4.2. Endpoint/Extension Promotion

The endpoint/extension distinction has several attractive features, but it partially negates the information
hiding provided by domain boundaries. We want to treat single processes and opaque complex objects
indistinguishably. However, we also don't want to send on simple endpoints anywhere except inside a process, and
we don't want to support connections and interfaces inside a process, at least not any further than its boundary to the
outside world.

These are conflicting desires. If a complex object is animated inside a shell with a simple endpoint, it can't
use that interface. If a simple object (process) is animated inside a shell with an extension of any structure, it can't
use that interface. Conversely in either case, if the interface is used, the simplicity or complexity of the object can
be determined from outside. The creator of an object should not know the complexity of its implementation or how
it manages its internal communications, either before or after the object is created.

Our solution is to allow an object to promote its internal views to either endpoints or extensions, as desired.
Promotion actually replaces the entire interface; when the object dies or is killed, the internal views do not revert to
the original structure. The internal components of a complex view are created or destroyed (not masked) as
necessary to match the change in structure.

To eliminate some inconsistencies that could arise during network partitions, the domain boundary shell and
all its interfaces must be replaced whenever an interface is promoted. This preserves the invariance of a shell’s
interfaces, which is an important assumption of the conflict resolution procedures. To reduce the possible conflicts
further, promotion is only allowed during the animation and investure operations. This is a simplifying feature,
rather than a critical one.

4.4.3. Taps

Corresponding components are easily defined, but the tap problem illustrates a limitation with multiplex
components. A valuable feature of dynamic communication structure is the ability to insert a monitoring process, or
tap, at any point along a communication path to debug or filter the contents of the communication. Figure 4.10
illustrates the insertion and removal of a tap in the middle of a connection.


Figure  4.10.  Tap  on  a  Path 


Ignoring changes in liveness, it is possible to insert and remove a tap transparently at any time on any
structure except a multiplex interface. For multiplex structures, the tap must be inserted before a component path to
be monitored is created and must remain in place until the last path has been destroyed.

For simple structures, taps forward communication from one side to the other. More generally, taps intercept
the results of endpoint operations from one side and reinvoke the operations on the other. Of the complex
structures, only new on a multiplex endpoint has an effect at its end-to-end peer. A multiplex tap detects the
automatic creation of a new component on one side, and invokes new on the other side. This in turn creates a
component at the ultimate end of the intercepted path. The tap remembers which components on each side match
up as parts of an intercepted component path, so that communication on the components can be properly forwarded
through the tap.
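The bookkeeping a multiplex tap performs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `MultiplexEndpoint` and `Tap`, the integer component ids, and the in-memory message lists are all hypothetical and are not part of the HPC prototype.

```python
class MultiplexEndpoint:
    """Hypothetical stand-in for one side of a tapped multiplex path."""

    def __init__(self, name):
        self.name = name
        self.components = {}  # component id -> messages delivered to it
        self._next_id = 0

    def new(self):
        """Create a fresh component and return its id."""
        cid = self._next_id
        self._next_id += 1
        self.components[cid] = []
        return cid


class Tap:
    """Forwards traffic between two endpoints, one component pair at a time."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.pairs = {}  # (side name, component id) -> (other endpoint, its component id)
        self.log = []    # everything the tap has observed

    def component_created(self, side, cid):
        """A component appeared on `side`: invoke new on the other side and pair them."""
        other = self.right if side is self.left else self.left
        peer_cid = other.new()
        self.pairs[(side.name, cid)] = (other, peer_cid)
        self.pairs[(other.name, peer_cid)] = (side, cid)
        return peer_cid

    def forward(self, side, cid, message):
        """Record a message seen on one component and deliver it to its partner."""
        other, peer_cid = self.pairs[(side.name, cid)]
        self.log.append((side.name, cid, message))
        other.components[peer_cid].append(message)
```

In this sketch the component correspondence exists only in the tap's `pairs` table, which is why discarding the tap loses the information needed to keep those paths alive.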

The problem comes in removing the tap. The obvious approach of disconnecting the tap and reestablishing
the intercepted connection won't work because of the multiplex component correspondence rule. New views
created while the tap is in place only have corresponding components in the tap's interface. When the tap is
removed those views are permanently dead. Similarly, component paths established before the tap can not be
monitored (except at the original corresponding components).

We did not attempt a solution to this problem. Definition of corresponding component pairs by agents is a
possibility worth investigation. Ideally, a solution to the tap problem would also address the cycle-avoiding and
authentication problems discussed below.

4.4.4. Reflectors

Besides the multiple paths discussed earlier, multicasting can build some non-obvious communication
patterns. Reflectors are one of these. Reflectors allow a chain to pass through the same view twice, once in each
direction. (Figure 4.11.)


Figure  4.11.  Reflecting  Path 

In order to connect the components of the multicast endpoint to each other, they (and all their component
structures) must have a neutral orientation, such as in-out. This neutrality necessarily applies to the entire chain due
to the compatibility restriction. Mechanisms lacking the ability to talk to themselves, like RPC, can not have neutral
orientations.

Since reflection is just a specific consequence of multicasting, it has all the usual effects on component
structures. Suppose an endpoint is one of its own end-to-end peers due to reflection. If it is a multiplex view, then a
new operation on it will create two components. If it is multicast, communication through any of its components will
be reflected through all of its components. If it is a bundle, the paths through each of its components will be
individually reflected.

4.4.5. Cycles

A trivial cycle is shown in Figure 4.12. It has no endpoints and represents no path between processes.
However, with multicasting, cycles can be introduced in the middle of end-to-end chains, as shown in Figure 4.13.
For every such chain, there will be an infinite number of others, each with one more repetition of the cycle.


Figure 4.12. Trivial Cycle


Figure  4.13.  Complex  Cycle  due  to  Multicasting 

These  repetitions  have  no  pathological  effects  on  the  HPC  model,  because  redundant  chains  have  the  effect  of 
a  single  chain.  However,  cycles,  reflectors  and  multiple  paths  require  a  subtle  algorithm  to  compute  end-to-end 
chains  without  infinite  looping  or  expensive  checks  for  overlapping  cycles  and  paths. 
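Because redundant chains have the effect of a single chain, it is enough to know which endpoints are reachable, not how many chains reach them. The following Python sketch shows the simplest way to get that set without looping forever: a breadth-first search with a visited set. It is not the algorithm used in HPC. The function name, the flat list of undirected links, and the omission of orientation, reflection, and component structure are all simplifying assumptions.

```python
from collections import deque


def end_to_end_peers(links, endpoints, start):
    """Return the endpoints reachable from `start` over undirected `links`.

    `links` is an iterable of (view, view) pairs and `endpoints` is the set of
    views that terminate a chain. Each view is visited at most once, so cycles
    in the middle of a chain cannot cause infinite looping.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    peers = set()
    while queue:
        view = queue.popleft()
        for neighbor in adjacency.get(view, ()):
            if neighbor in seen:
                continue  # already explored: a repeated cycle adds nothing new
            seen.add(neighbor)
            if neighbor in endpoints:
                peers.add(neighbor)     # a chain ends here; do not search past it
            else:
                queue.append(neighbor)  # interior view: keep following the chain
    return peers
```

For a chain p-a-b-c-q with an extra link from c back to a, the search from p still terminates and reports q as the only peer.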

The prohibition of cycles might be suggested to simplify the system, because a system without them can
express all the same useful communication patterns, but there is no obvious benefit to prohibiting cycles. First, it is
the agent's job, not the system's, to determine which chains are sensible and which are foolish. From the system's
perspective, the work required to detect cycles in order to forbid them is not less than the work needed to detect and
ignore them. From the agent's perspective, liveness provides a rudimentary protection against waiting indefinitely
for communication from useless channels. It must be admitted, however, that an agent requires more information to
make a truly informed choice.

Forbidding cycles actually has some undesirable consequences. A cycle is detected only when the last link is
created, typically due to a connect. Responsibility for the cycle (and the error) is distributed, but blame is not.
There is no way to decide which agent should best take action, and the error is detected arbitrarily long after
construction of the cycle begins. This is aggravated by network partitioning. During a partition the prohibition
against cycles can not be guaranteed. When a cycle is detected upon network merger, it must be reported to an
agent for removal. However, there is no obvious "last" link in the cycle and therefore no obvious agent to hold
responsible. Yet removal of the cycle must be enforced (else cycles really are allowed). Most inconsistencies of
this type are forced to a safe state by suspending all their views until an agent resolves the conflict, but that
technique has no significance for cycles.

Forbidding cycles also means that operations like connecting two views are sometimes illegal on
information that is not available to the concerned agents. The cycle can involve masked structure that is not even
accessible at the views in question. This problem could be avoided if agents had access to connectivity information
that identified views at the end of (partial) chains in addition to the anonymous liveness property.

4.4.6. Authentication

Many access control policies are a function of a requester's identity as well as the type of access. There are
two extreme views concerning the authentication of a communicating peer. The trusting view is that every
connection has been made correctly. Every interface of a complex object is connected to a child that is authorized
to receive or provide the corresponding service. As a result, processes are always put in contact with authorized
peers. Authentication is a de facto property of process structure. Many hosts take this view regarding their machine
console. Anyone at the console has privileges by definition.

The suspicious view is that an end-to-end peer could be anything at all. Through error or malice, every
connection might be to an unauthorized peer and authentication must be carried out whenever a new peer is
established. Most hosts take this view regarding their terminals. User identity must be established for each session.

Authentication procedures, encryption, and related topics are outside the scope of HPC, but we have a clear
obligation to provide a mechanism to inform suspicious objects when a peer is (re)established and must be
authenticated. Failing to end a session when a user loses telecommunication contact is a common security problem.
The next user to establish contact with the host gains the privileges of the previous user.

View liveness is one mechanism for reporting changes in connected peers. For one-to-one chains, HPC
guarantees that an endpoint will make a transition to the dead state whenever its end-to-end peer changes. Liveness
can be too conservative to preserve abstraction, because a complex object can not change its internal
implementation without triggering reauthentication. In principle, we should authenticate an abstract object, not the
collection of leaf processes that happen to implement its services. If the object is trusted to provide the appropriate
service, it should be trusted to manage its implementation.

In other cases, liveness is too weak for safety, because multicast interfaces allow the replacement of an
authenticated peer with an unauthenticated one without signalling a change in liveness, and because the addition of
(unauthorized) peers after the first live peer is not reported.

In a trusting environment, the liveness property could be supplemented by notifying objects when their
immediate external connections change. Neighboring objects could reauthenticate at the appropriate level of
abstraction. (This mechanism would have to reflect changes in the degree of multicasting, as well.) In a suspicious
system, one could record the partial chains reachable through each view. By associating an authenticated peer with
a particular view on a particular chain (thus some specific object), substitution or insertion of an unauthenticated
object on the near side of the selected view can be detected, while ignoring changes on the far side (inside the
object).

Recording chains can be used to avoid cycles and address the tap problem, as well as to trigger authentication.
Cycles can be detected or avoided by checking connected views for membership in each other's chains. If
multiplex corresponding components are made explicit, some form of user selection of correspondence to address
the tap problem becomes possible. Unfortunately, providing so much information about arbitrarily remote process
structure to facilitate one aspect of security is difficult to reconcile with abstraction, information hiding, and access
control. An adequate solution to these problems remains to be found.

4.4.7. Multicasting Semantics

HPC demonstrates that essential communication patterns can be expressed structurally. Multicasting is
critical for many applications, and it has been easily integrated into the HPC model as a structural feature, rather
than an addressing or special transport mechanism. Unfortunately, even simple uses of multicasting may not be
compatible with the semantics of the underlying IPC mechanism.

[Figure: structure: multicast; structure: simple TCP/IP in-out]

Protocols such as TCP/IP designed for reliable data streams between two processes make very strong assumptions
about exactly one set of peer processes. The HPC system can prohibit such mechanisms at the leaves of multicast
trees, or do something more complex than create a single transport medium to implement the effect of multicasting.
For TCP/IP, a central redistribution process is introduced to replicate and merge individual TCP streams. This
provides a useful service, but it certainly doesn't provide the general semantics of reliable multiple delivery.

It is often not obvious how to use an existing IPC mechanism in an HPC multicast context. For example, the
behavior of one mailbox shared among all peers is not the same as a separate mailbox shared by each pair of peers.
In the first case, only one peer removes a copy of the message, while in the second, all peers get a (separate) copy.
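The difference between the two mailbox arrangements is easy to see in a small Python sketch. The classes `SharedMailbox` and `PairwiseMailboxes` are hypothetical illustrations built on in-memory queues. They borrow the operation names deposit and withdraw from the mailbox row of the tables in Section 4.5, but do not model any particular mailbox implementation.

```python
from collections import deque


class SharedMailbox:
    """One mailbox shared among all peers: each message is withdrawn exactly once."""

    def __init__(self):
        self._queue = deque()

    def deposit(self, message):
        self._queue.append(message)

    def withdraw(self, receiver):
        # Whichever peer asks first removes the only copy.
        return self._queue.popleft() if self._queue else None


class PairwiseMailboxes:
    """A separate mailbox for each ordered pair of peers: every peer gets its own copy."""

    def __init__(self, peers):
        self._peers = list(peers)
        self._boxes = {
            (sender, receiver): deque()
            for sender in self._peers
            for receiver in self._peers
            if sender != receiver
        }

    def deposit(self, sender, message):
        for receiver in self._peers:
            if receiver != sender:
                self._boxes[(sender, receiver)].append(message)

    def withdraw(self, receiver, sender):
        box = self._boxes[(sender, receiver)]
        return box.popleft() if box else None
```

With peers a, b, and c, a message deposited by a into the shared mailbox is seen by only one of b or c, whereas with pairwise mailboxes both b and c can withdraw their own copy.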

There is not enough experience with multicast mechanisms to select a general set of principles for extending
conventional IPC mechanisms to multicasting. The interactions between multiple delivery and communication
mechanisms with varying amounts of state will probably remain unclear in the foreseeable future.

4.5.  Communication  Taxonomy 

Direct interactions between processes in HPC are determined by three factors: logical configuration, transport
medium implementation, and communication. Each factor is controlled separately by distinct agents. Configuration
is a dynamic, incremental process of modifying abstract structure, and responsibility for one end-to-end chain is
distributed over many agents. The HPC system is responsible for creating and destroying transport media to reflect
the changing logical configuration. And the content of communication is controlled solely by the processes at the
ends of the transport media.

In most systems these distinct functions can not be separated. Usually, configuration is merged with
communication. For example, configuration in a link-based system (DEMOS, Charlotte, Accent/Mach) is
accomplished by sending a link in a message, while connection setup in TCP/IP is controlled by the two
communicating processes. A proper taxonomy of computer communication should distinguish these functions.

While the correspondence is not exact, in terms of the ISO seven-layer model, communication is user
invocation of the transport layer, composition is user invocation of the session layer, and implementation is carried
out by the session layer. (Table 4.1.)


HPC Feature       ISO Layer       How Related
role              application     purpose, intent
type              presentation    data encoding
composition       session         user invocation
implementation    session         layer function
structure         transport       user specification
signalling        transport       user invocation

Table 4.1. Relation to the ISO Model


Different IPC mechanisms generally have different operations for communication. For example,
communicating with messages involves deciding when to send and receive messages, and what the contents of
messages should be. Communicating with a semaphore involves deciding when to wait (P) and signal (V).
Signalling with RPC involves deciding when to call a procedure, when to return from a call, and what the arguments
and return values should be. (Table 4.2.)


Mechanism    Communication Operations
HPC          unspecified
CONIC        send, receive, reply
Hydra        send, receive, reply
RPC          call, accept-reply
socket       various, usually send, receive
memory       read, write, P, V, fetch-and-phi
file         read, write, seek
mailbox      deposit, withdraw
link         send, receive
filter       send, receive
Linda        insert, remove, retrieve, eval-and-expand

Table 4.2. Communication Operations


Mechanism    Configuration Operations
HPC          connect, disconnect
CONIC        link, unlink
Hydra        connect, disconnect
RPC          bind
socket       bind, listen, accept, connect, disconnect
memory       link, load, address
file         create, open, close, inherit
mailbox      create, name
link         create, transfer
filter       set-filter
Linda        set-pattern

Table 4.3. Configuration Operations

The configuration operations for these three example mechanisms are deciding whom to send a message to
(whom to receive from), deciding which processes have access to the semaphore, and deciding which client stubs
are bound to which server entries, respectively. (Table 4.3.) Implementation of conventional IPC mechanisms is
usually triggered directly by configuration operations and managed by the host operating system.

Communication, composition, and implementation are not just conceptually different activities. If distributed
programming is to incorporate more complex abstractions than the well-known client-server model, these activities
must actually be carried out by different agents. We argue that configuration must be an activity distributed among
multiple agents. Further, the configuring agents are generally not the same as the communicating processes. These
statements are already true in simple ways for systems other than HPC. Designers of future IPC mechanisms should
provide for their full realization.

To see that this functional separation is important to multiprocess, distributed systems more general than
HPC, consider a system offering only one service with multiple server processes, a file service, say. Every process
either belongs to, or is a client of, the file service, and the multiprocess implementation of the file service is
transparent to the clients. Before a client and a server process communicate, the client must decide to access the file
service and some agent in the file service must decide which server process is to handle the client's request. Neither
the client, nor the file service agent, can decide unilaterally to bind the client and server processes. The logical path
between the communicating processes has two segments. (There are two independent contributions to the decision
to bind that particular pair of processes.)

To extend this argument somewhat, assume that a protection system allows a process in one process group to
access only public features (such as exported names or interfaces) of other groups, and that groups do not export the
names of their internal processes. This already rules out the common situation (TCP/IP) where a communicating
process S chooses its partner process P. If S and P are in different groups, S can only specify a public feature of the
group G to which P belongs. The choice of P to complete the configuration must be made by some process in group
G. This process can not be S and, symmetrically, P can not directly choose S.

Suppose the abstraction presented by a process group could be implemented in terms of other, less complex
abstractions. This seems like a fundamental objective of any structuring system. One way to do this is to allow
nesting of groups, perhaps to bounded depth. Another way to achieve some of the same effect, even with a flat
space of groups (no nesting), is to forward communication addressed to one group on to a second group, bypassing
all the members of the first group. (A null modem has this kind of internal structure.) Access control or visibility
constraints will prevent a client in one process group from knowing whether it is interacting with a process of a
server group or some other group. In such cases, configuration can involve groups that do not contain either of the
communicating processes. Establishing a path through several process groups requires the involvement, and
implicit cooperation, of a configuring process in each of them.

Static process structures avoid run-time configuration choices altogether. Configuration is then typically an
activity of human designers, while implementation is carried out by support software. Communication remains a
run-time activity. However, static structures make the question of distributed and incremental configuration a moot
issue of design methodology. We suggest that in successful methodologies configuration remains distributed and
incremental, where distribution refers to separation among specification modules rather than process groups.


Chapter  5 


5. Non-Hierarchical Structure

Despite the familiarity of a strictly hierarchical structure, there are two reasons non-hierarchical relationships
between objects must be accommodated. A strict hierarchy with explicit composition has nice formal properties, but
is impractical for systems with real applications, even when their structure is static. Section 5.1 discusses the need
for transparent access to shared resources.

Operations on HPC structure during a partition can easily lead to inconsistent hierarchies. Strict hierarchies
are insufficient to express the merge of two strict hierarchies. A full discussion of this problem is deferred until the
next chapter, but the necessary tools are developed here.

The first step is separating shells' role in defining communication paths from their role in defining the
hierarchy. Section 5.2 extends the dual graph of the object hierarchy with splice edges having no effect on the
hierarchy, but allowing communication between arbitrarily distant objects. Splices are then carefully integrated into
the protection system to preserve the local appearance of a strict hierarchy, and to prevent unwanted interference
between distant objects. (Splices differ substantially from the mechanism originally described in [LcFSSJ and
[LeFS5].)

Managers  of  global  services  use  a  promiscuous  splice  facility  to  accept  splices  from  arbitrary  clients.  This 
provides  effective  multiplexing  of  splices  to  complement  multiplexing  of  interfaces.  (Section  5.4.) 

Section 5.5 concludes with some design interactions between splices and the existing structural features, and
discussion of the pitfalls and tradeoffs of hidden violations of the hierarchy.

5.1. Transparency

The typical UNIX filter program has an input interface and an output interface, and only these interfaces are
relevant to its composition with other programs. In addition, it may interact with resources like the file system.
However, access to such resources is transparent to the user of the filter. That is, access to external resources does
not change how the user sees the filter. The filter provides a public abstraction (interfaces) and transparently makes
use of additional private interfaces to external services. These implementation details are not part of the public filter
abstraction.

A strict hierarchy with explicit composition has nice formal properties, but is impractical for systems with real
applications. A purely functional methodology requires upper levels of the hierarchy to know, and provide, all the
external resources required by lower levels. There is no way for an object to access a resource unbeknownst to its
parent unless it completely encloses that resource. This methodology has some definite problems.

• Abstraction is unnecessarily limited.

A module's public interface should define its abstraction and hide its implementation. There is no way to separate
the HPC interfaces used in "public" and "private" interactions.

¹ There is a regrettable clash in the use of the terms transparent and opaque. As ordinary words they are
contradictory, but as technical terms they both indicate that certain details are not visible to the user. The systems
community uses phrases like "virtual memory is transparent", while the programming languages community writes
things like "this type is opaque." The recent interest in object-oriented systems has brought both communities, and
their jargon, into intimate contact. We will consistently use "opaque" to indicate domain, and thus visibility,
boundaries, and "transparent" to indicate that an object's abstraction, and thus its interfaces, does not define all of
its interactions with external resources. If this muddies the water further, we ask to be forgiven.

• The benefits of explicit composition are lost.

At higher levels of the hierarchy there is an accumulation of uninteresting "plumbing" whose sole purpose is
providing global services to lower levels. The connections relevant to the higher abstractions are obscured.
(Figure 5.1.)

• The system is not modular.

Gaining access to a new service, or changing the implementation of an object, requires traumatic changes to the
object hierarchy. All shells between a service and its client must be destroyed and recreated with a different set of
interfaces to provide a different set of resources.

•  The  system  is  not  open. 

Top-level  objects,  including  independent  global  services,  are  only  trivially  composed.  No  connections  are  ever 
made  among  them,  and  they  can  never  interact. 

These defects can be remedied by relaxing either strictly explicit composition or strictly nested abstractions.
HPC provides transparent violations of the object hierarchy.

5.2.  Splices 

Shells normally represent boundaries or separations. There is an inside and an outside; the inside defines an
object while the outside defines its environment. However, the dual graph emphasizes shells' role in
communication. Communication between spaces must travel over the edges representing shells. We will now
isolate this communication function, and unify the previously distinct structural features of shells and interfaces in
the dual graph.

Figure 5.1. Unwanted Plumbing


Each shell has a fixed number of attached interfaces, each of which is the root of a tree of views. A shell can
be replaced by a pair of bundle endpoints bound as private peers. Its interfaces are demoted from roots to the
immediate children of the new endpoints. Chapter 4's discussion of communication structure is unaffected. Besides
the parsimony of features, this elimination of shells makes a nice sort of sense. A shell defines the abstract interface
between two spaces by grouping together several communication interfaces, which is the function of bundles.

The HPC structure visible to clients retains the notion of shell, and the interfaces, rather than the bundle
endpoints, are presented as the roots of view trees. However, internal computations of communication paths
disregard shells, and use the bundle endpoints instead. The (internal) root of a view hierarchy is always attached to
a single space, and all its descendant views are in that space. The root is always bound as a private peer to another
view, and this binding defines a path of communication between spaces. (A related simplification is that there is
exactly one direct private binding between views for each edge between spaces, rather than one for each interface on
the shell. The corresponding component rule binds any number of interface subtrees.)

We can now add arbitrary edges to the dual graph, even multiple edges between the same two spaces, in
confidence that non-hierarchical abstraction and composition are well-defined. Each edge from a space represents
one abstract interface between neighbors, and composition is defined by edges between spaces and connections
within them. An object can present one interface to its immediate parent in the hierarchy, others to its immediate
children, and still others to unrelated objects through which it transparently accesses global services.

However, hierarchical organization should not be discarded simply because it can not be used everywhere.
Violations of the hierarchy should retain the appearance of a directed tree by providing every edge with a direction
and every space (save the root space) with exactly one parent. The second constraint means only a rooted spanning
tree can be associated with the object hierarchy. (We continue to call its edges shells.) The remaining edges
(splices) are fundamentally undirected, affecting communication paths but not the object hierarchy.

Splices are presented to clients as opaque subtrees to disguise the non-hierarchical structure. Both ends of a
splice appear to be inferior domain boundaries, regardless of the real relationship between the endpoints. No
violations of a strict tree will be apparent through inspection or traversal of a static object hierarchy.

These boundaries allow splices that join any pair of spaces, between domains, within one domain, or even a
self-loop on one space. Splices within a domain must be accounted for because merging domains through abdicate
or depose can easily transform splices between domains into splices within one. Similarly, merging spaces through
disclose can transform splices between spaces to self-loops.

The apparent domain boundaries make it easy to provide dynamic behavior consistent with a strict hierarchy.
The only operations applicable to an inferior domain boundary are kill and depose. The kill operation replaces an
arbitrary subtree with an empty leaf space, and the HPC system is free to assume the convenient terminal policy die
for an illusory domain, so depose is treated the same way. (Actually, creating and destroying a splice is a two-step
process. Each endpoint is manipulated separately, but the entire splice is not destroyed until both endpoints have
been removed from their incident spaces.)

Because creation of a splice must specify the remote end of the edge, it is a ripe opportunity for error and
malice. A file system client can not be responsible for understanding the internal management of a global file
service well enough to know where to put the other end of a splice. The agent for the file service should decide that.
Nor is any agent allowed to unilaterally change the internal structure of unrelated domains that don't wish to be
interfered with. Therefore, splicing is a two-step process, requiring the active cooperation of agents controlling both
affected domains.

First consider the destruction of the splice <A, B> shown in Figure 5.2 by invoking kill or depose on the
(apparent) shell A.


Figure  5.2.  Complete  Splice 

The  endpoint  A  is  moved  from  its  space  to  a  hidden  space  available  only  to  the  HPC  system,  and  replaced  with  the 
shell  C  and  an  empty  leaf  space  (Figure  5.3). 


Figure  5.3.  Incomplete  Splice 

When the other endpoint B is destroyed, it is similarly replaced by the leaf D (Figure 5.4). The splice is
destroyed when both endpoints have been moved to the hidden space.


Figure  5.4.  Unspliced  Leaves 

Creating a splice inverts these steps. To splice D and C, one agent invokes the splice operation with D and C
as arguments (order is significant). D must be an empty leaf. It is replaced by the splice <A, B> with A hidden.

If C is an empty leaf with structure complementary to D, HPC remembers that D was to be spliced to C.
When the other agent invokes splice with arguments C and D, HPC finds the hidden end of the splice <A, B> and
replaces C with it.

If C and D are not compatible, the new splice will be created, but it will not be associated with the remote
interface. (Cf. new on a dead or suspended interface.) This ensures structural compatibility for all private peer
bindings, while preventing an agent from learning about another domain by blind probing for leaves.
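
The creation steps above can be sketched as a small bookkeeping model. This is only an illustration under stated assumptions: the class name SpliceTable, the compatible callback, and the use of plain strings for leaves are hypothetical and are not part of the HPC interface described in the text.

```python
class SpliceTable:
    """Toy model of the bookkeeping for incomplete splices (hypothetical names)."""

    def __init__(self, compatible):
        self.compatible = compatible  # assumed symmetric test: (leaf, leaf) -> bool
        self.pending = {}             # local leaf -> remote leaf it was spliced toward
        self.complete = set()         # frozensets holding each completed pair of leaves

    def splice(self, local, remote):
        """Invoked by the agent controlling `local`.

        Returns nothing on purpose: the caller learns nothing about `remote`
        from the call itself, so blind probing reveals no remote structure.
        """
        in_use = local in self.pending or any(local in p for p in self.complete)
        if in_use:
            raise ValueError("local must be an empty leaf")
        if self.compatible(local, remote) and self.pending.get(remote) == local:
            # The other agent already spliced toward us: complete the splice.
            del self.pending[remote]
            self.complete.add(frozenset((local, remote)))
        else:
            # Remember the request; it takes effect only if the peer cooperates.
            self.pending[local] = remote

    def is_complete(self, a, b):
        return frozenset((a, b)) in self.complete
```

In this sketch an incompatible pair simply never completes, which mirrors the statement that the new splice is created but not associated with the remote interface.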

The splice operation is cooperative, symmetric, and secure. The agents controlling both leaves must
explicitly invoke splice, and it doesn't matter which invocation comes first. If one agent splices to C, the only
effect on C is that a subsequent splice(C, D) will complete the splice. If C is spliced to some other leaf, the
incomplete splice from D to C is entirely ignored. The agent controlling D gains no information about C unless and
until the agent controlling C chooses to complete the splice.

The splice and enclose operations are about equivalent in complexity. Both create a new pair of bound views
and attach them in spaces. Enclose must partition a space, while splice must look for an incomplete complementary
splice.

5.3.  Example:  Accessing  a  Global  File  System 

Accessing a top-level service from arbitrary points in the hierarchy is a primary motivation for splices. Here
we use a file server application to illustrate how such access can be organized, glossing over the details of splice
creation. In many file systems, a client must perform a directory operation, such as open, to obtain some abstraction
or handle for a file through which file-specific operations, such as read or write, must be invoked. The natural HPC
representation gives the client a splice to the file system directory, and an additional splice for each open file. The
different types of operations available from a directory and from a file will be encoded in the interfaces of the
splices.

The example client shown in Figure 5.5 will reduce a stream of data by adaptive filtering and issue a
smoothed version of its output. Internally, it will use files to log filtering parameter changes, for off-line analysis of
input characteristics and filter performance, and journal the input data currently within the filtering window,
allowing the managing agent to recover or migrate the filtering process. Given an initial splice to the file system
directory, the client can negotiate the cooperative creation of an additional splice by first telling the HPC system
where its end of the splice is to be located and then invoking the file system open operation with its credentials and
the name of the desired file.


Figure 5.5. Hidden Access to Global File Service
[Diagram labels: Filter, Filter, File Sys, Manager, Splice]

If the file server finds the operation acceptable, it tells the HPC system to complete the splice. The server is
free to do anything with its end of the splice. It may, for example, use a separate internal process to service each
splice representing a file, perhaps distributed across hosts to provide the lowest cost communication with the
respective clients. The resulting structure with separate server processes is shown in Figure 5.6. Note the
client-server symmetry.


Figure 5.6. Client-Server Symmetry


5.4.  Promiscuous  Splices 

The  symmetric  splice  operation  presupposes  that  agents  can  negotiate  an  agreement  on  the  leaves  to  splice, 
which itself assumes some existing communication path. The "plumbing" needed for negotiating splices is not
significantly  less  than  the  plumbing  needed  to  access  external  services  in  the  first  place. 

To provide an escape from this circular dependence, the HPC agent reserves a small set of promiscuous shell
names for special treatment. Let S be a promiscuous shell, and A be a compatible leaf. If the agent controlling A
invokes splice(A, S), the other end of the splice will be immediately installed as a sibling of S. With this special
treatment, the agent controlling S does not need to take any action to create the splice.

(This operation, and new, are the only HPC primitives that can have a direct, unilateral effect on remote
structure. Their ability to interfere with the remote domain is limited to the nuisance level because new structure is
always created, and no existing structure is modified. Creating a splice or a new component therefore can not
interfere with any ongoing activities involving other structure.)
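
A minimal sketch of the promiscuous case may make the "nuisance level" argument concrete: the remote space only ever gains a new sibling, and nothing already in it is touched. The Space class, the promiscuous_splice function, and the naming scheme for the new sibling are assumptions made for illustration, not HPC primitives.

```python
from itertools import count


class Space:
    """Toy space holding shell names; promiscuous splicing only appends to it."""

    def __init__(self, shells):
        self.shells = list(shells)


_fresh = count(1)  # source of names for newly created sibling shells (hypothetical)


def promiscuous_splice(leaf, shell, space, promiscuous_names):
    """Splice `leaf` to a reserved shell without any action by the remote agent.

    A new sibling of `shell` is installed in `space`; existing shells are
    neither modified nor removed. Returns the two ends of the new splice.
    """
    if shell not in promiscuous_names or shell not in space.shells:
        raise ValueError("target is not a promiscuous shell in this space")
    sibling = f"{shell}#{next(_fresh)}"
    space.shells.append(sibling)
    return (leaf, sibling)
```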

These reserved shell names are the equivalent of well-known service numbers or well-known port numbers.
In fact, this mechanism is taken more or less directly from the DARPA TCP/IP protocol for establishing a
connection. A TCP/IP server listens on a socket with a well-known port number. Clients initiate connections
between a local socket and the server's well-known socket. The TCP/IP protocol creates a new server socket and
establishes the actual connection between the client socket and the new server socket, leaving the well-known server
socket free for initiating other connections.


It is unnecessary, and ultimately painful, for every server to use a well-known name to advertise its services.
Here we demonstrate a global name registry or switchboard service. Any object can create a splice to the
switchboard using its well-known special shell and then send a message to register a component shell to be used for
splices, or to request the switchboard to facilitate a splice with another object.

In Figures 5.7 through 5.10, a client is shown locating a server and creating a splice to it. Initially, in Figure
5.7, Server has spliced SX to the special switchboard shell X. (Dashed curves will indicate pairs of shells that have
been spliced.) After creating the path to the switchboard, Server registered SC (by string or any convenient identifier)
as a shell it is willing to splice to a client.

Figure 5.7. Server Registered with Switchboard

In Figure 5.8, Client has spliced its shell CX to special shell X. Notice that a new sibling of X is created for
the splice. Client can detect when the switchboard connects a process to the new shell, as in Figure 5.9, due to the
liveness property. At that time, it will register its shell CS as the shell it wishes to splice to a server.

Any  time  thereafter,  Client  can  send  another  message  to  the  switchboard  asking  for  assistance  in  negotiating  a 
splice  to  Server,  for  which  Client  knows  the  appropriate  identifier.  The  switchboard  response  is  to  send  Server  the 
name  of  shell  CS  through  the  spliced  shell  SX,  and  to  send  Client  the  name  of  shell  SC  through  the  spliced  shell  CX. 
The  two  objects  can  then  carry  out  the  two  step  splice  operation,  as  illustrated  in  Figures  5.2  through  5.4.  The  final 
structure  is  shown  in  Figure  5.10. 
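
The switchboard exchange can be sketched as an ordinary application-level registry. Everything here is an assumption for illustration: the Switchboard class, its method names, and the use of Python lists as stand-ins for the message paths through the spliced shells SX and CX.

```python
class Switchboard:
    """Toy name registry: maps an identifier to a shell name and a reply path."""

    def __init__(self):
        self.registry = {}  # identifier -> (shell name, inbox standing in for a splice)

    def register(self, identifier, shell, inbox):
        # An object announces the shell it is willing to splice.
        self.registry[identifier] = (shell, inbox)

    def introduce(self, client_id, server_id):
        # Send each party the name of the other party's registered shell.
        client_shell, client_inbox = self.registry[client_id]
        server_shell, server_inbox = self.registry[server_id]
        server_inbox.append(client_shell)  # Server learns CS (via SX in the text)
        client_inbox.append(server_shell)  # Client learns SC (via CX in the text)
```

After introduce returns, each side holds the name it needs to invoke the ordinary two step splice; the switchboard itself creates no splice between them.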

A point should be stressed here. The HPC agent supports promiscuous shell names, and permits the
switchboard to use one. Other than that, all that has been described here is defined as an application. The interface
to the switchboard, the messages to be used, the kinds of strings which can be used to register a shell, and so forth
have no impact at all on HPC. They are established by convention, and can be freely extended and changed.


Figure 5.8. Client Contacts Switchboard


Figure 5.9. Switchboard Communicates with Client


Figure 5.10. Client-Server Splice Established

5.5. Discussion

5.5.1. Transparency is Not Always a Good Thing

Transparent abstraction fulfills a pragmatic need at the expense of the aesthetics of pure composition.
However, even from the purely practical point of view, hidden violations of the hierarchy can have disadvantages.
Ignore software for a moment and consider apartment layout. Residents want hot and cold fresh water, sewage
disposal, heating, and electricity provided without worrying about where these services come from. For its
occupants, an apartment is a convenient self-contained unit independent of other apartments. However, for
architects, contractors, and repair persons the bounds of a given apartment are somewhat artificial. They are much
more interested in the network of plumbing and wiring that ties a whole building together.

HPC does not reconcile these two points of view. It allows construction of complex objects using either
explicit provision of service from above, or hidden direct access to services. However, architects and building
managers can prevent residents from tapping electrical trunks directly, and are obliged to provide residents with the
standard services. In hierarchical process composition, an object can not prohibit a subobject from obtaining
transparent access and a subobject can not force its parent to provide a needed service.

Future research should investigate the problem of controlling hidden access paths. Residents of an apartment
complex need to be prevented from tapping services directly for two primary reasons. First, resource utilization
must be controlled and accounted for. The second issue is safety: uncoordinated access to a resource can endanger
all users of the resource. In large-scale software, similar issues arise. An object may need to restrict or account for
the resources it consumes, and can only do so by controlling the access of its subobjects. Cycles, storage, and
communication cost real money, and most hosts shared among several user groups have some form of accounting,
billing and quotas which must be observed.

As an example of the safety problem, consider the lightweight transaction mechanism proposed by
Zwaenepoel and Almes, which could be built easily on top of the HPC agent [ZwA85]. In their scheme, each
worker process participating in a distributed computation is given a unique identifier, a set of input files, and a set of
output files. A centralized job manager is responsible for assigning resources to workers and collecting their results.
The manager gives a worker process unique temporary files to use for output, so a worker has no effect on shared
data prior to a commit. When the manager decides to commit a computation, it (atomically) renames temporary
output files as shared data files. Since a worker's results are completely written into a temporary file before a
commit, the actions of a worker process appear atomic. If the manager process is aborted, the rename operation will
never be executed, hence the workers will have no effect on shared data. If a worker process is partitioned from the
manager, a new worker process can be created by the manager with a new identifier. The output files of the
previous worker process will never be renamed since only the most recent identifier is allowed to cause a commit.
Eventually the orphan worker process will complete or abort; in either case its results will be discarded.

As  long  as  worker  processes  use  only  resources  obtained  from  the  manager,  this  form  of  lightweight 
transaction  works  nicely.  However,  if  worker  processes  can  transparently  bypass  the  manager  and  obtain  file  access 
directly  from  the  file  system,  the  apparent  atomicity  of  a  transaction  can  not  be  guaranteed. 
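
The commit-by-rename scheme can be sketched with ordinary files. This is a loose illustration, not the mechanism of [ZwA85]: the JobManager class, its method names, and the choice of one output file per task are assumptions. It relies on os.replace, which renames atomically when source and destination are on the same file system.

```python
import os
import tempfile


class JobManager:
    """Toy manager: workers write temporary files; only a commit renames them."""

    def __init__(self, shared_dir):
        self.shared_dir = shared_dir
        self.current = {}  # task name -> most recent worker identifier
        self.temp = {}     # worker identifier -> path of its temporary output file
        self._next_id = 0

    def start_worker(self, task):
        """Assign a fresh identifier and a private temporary output file."""
        self._next_id += 1
        wid = self._next_id
        fd, path = tempfile.mkstemp(dir=self.shared_dir, prefix=f"tmp-{wid}-")
        os.close(fd)
        self.current[task] = wid  # any earlier worker for this task becomes an orphan
        self.temp[wid] = path
        return wid, path

    def commit(self, task, wid):
        """Publish the worker's output only if it holds the most recent identifier."""
        path = self.temp.pop(wid)
        if self.current.get(task) != wid:
            os.remove(path)  # orphan results are discarded
            return False
        os.replace(path, os.path.join(self.shared_dir, task))  # atomic rename
        return True
```

The guarantee holds only while workers write nowhere except the paths the manager hands out, which is exactly the assumption that transparent access to the file system would break.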

It may be possible to exploit the hierarchy when establishing limits on shells that may be spliced together.
The following sketch attracts further consideration. An object could limit the subtrees (anywhere in the tree) to
which its component objects may create splices. That is, a server can usually be identified with an object, and
therefore a subtree of the structure graph. An object might allow its components transparent access to specified
subtrees (services), or forbid access to certain services, and so forth. Existing experience with scope rules for
programming languages with import and export should be quite relevant.

5.5.2. Peer-to-Peer Symmetry

Splices avoid the usual asymmetry between clients and servers. Returning to the principal motivation for
transparent non-local access and thus the splicing mechanism, a client of a global service can splice a leaf shell to
one belonging to the server and then communicate directly with the server. The client's spliced shell retains its
structural position as an implementing component within the client. Non-local access through a splice appears to be
access to a completely enclosed subcomponent.

It is a short step to realize that the server can think of a client as a subcomponent as readily as the other way
around. This is an extension of the parent-child symmetry made apparent by the dual graph, and it holds even in the
hierarchical view. Splices provide naturally for peer-peer relationships as well as the more common, but limited,
master-slave organization.

The dual graph displays even greater symmetry. An object is simply a subtree and the abstraction it
implements is the shell at its root. The global process structure can be broken into subtrees at any space and the
composition inside the space defines the global behavior as a function of the subtree behaviors. Because there is no
distinction between parent and child in the dual graph, a subtree can treat an incident shell (representing its parent)
as a logically subordinate object, but the parent can treat the other side of the shell (representing the child) the same
way. It is all a matter of perspective; the root of the dual graph can be chosen arbitrarily.

5.5.3. More and Fewer Restrictions

The strict tree structure could be retained by simulating the splice operation outside the HPC system.
Specifically, instead of splicing two shells, an agent could animate an appropriate process in an empty shell, passing
parameters to indicate the shells to be spliced. Processes of this type would interact outside the HPC system,
locating processes with complementary arguments, and forwarding communication from one process to the other.
(Splices between more than two endpoints could be simulated.)

As with multicasting, we are obligated to demonstrate how important relationships can be directly expressed
in terms of process structure. Therefore, we integrated splices into the object hierarchy to express transparent,
non-hierarchical access as directly as is consistent with protection and visibility restrictions, rather than depend on a
mechanism outside the system.

Instead of maintaining a canonical spanning tree of shells, with all other edges distinct splices, parts of the
tree could be left indeterminate until a hierarchical operation actually affects that structure. This option has not
been carefully explored, but it has some attractions. Any system that maintains a canonical tree must break all
symmetries in the complete graph. Any time two spaces share multiple parallel edges, one edge must be identified
as part of the tree. Delaying this identification until needed permits greater symmetry. It is probable that this would
reduce the number of merge conflicts (Chapter 6) that must actually be reconciled.

It is possible to give clients an explicitly different representation for splices, instead of disguising them as
opaque shells. However, this puts a greater burden on agents by adding structural features and operations they must
understand, and the overall functionality (thus system complexity) remains the same.

In the direction of accepting still less restricted graphs, some merge inconsistencies could be avoided by
giving up trees (and even DAGs) and using general directed graphs or even the undirected dual graph as the basic
structure. Communication and protection structure could carry over essentially unchanged, but a replacement for
the asymmetric privileges of superior domains over inferior ones would have to be found. The basic methodology
of nested abstractions would be lost, and that seems like too great a cost for too little return.

5.5.4. Implicit Composition

Instead of relaxing the strictness of the hierarchy, we could have relaxed explicit composition. For example,
programming language open scope rules avoid the clutter associated with explicit configuration of all related
components, especially shared access to globally exported modules.

These default rules are not appropriate for systems where configurations are to be inspected and incrementally
modified during execution. Sorting through all the implicit compositions for the ones actually used by an object is
not  an  efficient  technique  for  determining  its  actual  configuration.  Splices  (and  shells)  identify  the  interfaces 
actually  in  use  by  an  object,  not  all  the  potential  interfaces.  Promiscuous  splices  are  a  convenient  analog  to  global 
exports  when  they  are  really  desired,  while  ordinary  splices  allow  greater  control  over  configuration. 

It also seems implicit composition would complicate communication structure substantially, by the
introduction of default rules for configuration, and by the extension chains with a third form of binding between

6. Partition and Consistency

A well-formed HPC structure satisfies a number of constraints. For example, an endpoint may be part of at
most  one  connection,  a  space  must  belong  to  exactly  one  domain,  and  the  directed  edges  (shells)  must  comprise  a 
strict  tree.  Under  normal  circumstances,  HPC  primitive  operations  preserve  these  constraints,  transforming  well- 
formed  structures  into  well-formed  structures.  Enforced  preconditions  on  HPC  primitives  prevent  invocations  that 
would  result  in  ill-formed  structure. 

In  general,  a  distributed  application  subject  to  partition  and  merge  can  not  offer  completely  consistent  service. 
It may provide service that is consistent over time for each client or service that is consistent over all clients (in a
partition) at any given time. Stated another way, a distributed service must decide between discontinuous service
over time or over clients. To give every client the same service, interactions with specific clients will be disrupted
as previously partitioned states are reconciled and service is resumed on the basis of the new state. The service may
avoid  these  disruptions  by  retaining  the  previously  partitioned  states  and  serving  clients  differently  according  to 
their  differing  histories.  A  simplifying  compromise  is  more  common.  A  distributed  service  usually  maintains  a 
single  canonical  state  and  allows  access  to  clients  in  at  most  one  partition  (chosen  by  a  quorum  of  resources).  No 
matter  which  choice  is  made,  the  specification  of  a  service  must  define  the  allowable  inconsistencies  over  time  and 
between  clients. 

When  a  partition  occurs,  each  partition  inherits  the  pre-partition  structure,  and  subsequent  operations  can  be 
checked for soundness within each partition. Obviously it is not possible to evaluate preconditions on structure that
may have been created or modified in other partitions, and locally-sound operations may produce structure that is
inconsistent between partitions. While a partition lasts, these inconsistencies are of no practical significance
because  (by  definition  of  partition)  they  cannot  be  detected.  When  a  merge  occurs,  however,  a  well-formed 
structure  must  be  reestablished. 

There are three basic strategies for dealing with inconsistencies due to merge. Avoidance restricts service so
that inconsistencies at merge time are prohibited. Reconciliation combines several states into a single state
algorithmically, possibly with a change in service. Reporting presents the inconsistencies to clients explicitly,
permitting client resolution. All three techniques have an impact on system design.

Most distributed database consistency control techniques involve avoidance [SLR76], [BeG81], [BeG84].
Generally, the database can be written in at most one partition, ensuring the existence of exactly one authoritative
"most recent" version. Avoidance is appropriate when the system does not understand the structural constraints that
must be preserved (the application semantics of entries in a database), and when clients cannot tolerate
asynchronous structural change (as in transaction systems). Avoidance by denying service is inappropriate for
systems offering high availability, for applications that wish to apply their own consistency control policies, and for
environments with a high expected frequency of partition.

The Locus file system allows updates to partitioned files in all partitions [PWC81], [WPE83]. Two forms of
resolution are applied to inconsistent files. A history of updates and partitions (called a version vector) is used to
replace inconsistent copies of arbitrary files with a dominating update, if any exists. Remaining inconsistencies in
specialized files, such as mail boxes, are resolved by merging the partitioned contents of a file. Network clock
synchronization protocols that maintain a distributed monotone value, with fixed upper and lower bounds on its rate
of change, are good examples of complex resolution algorithms [Lam78], [Mir84], [Ray87], [WeL88]. Resolution
is appropriate when the system understands the constraints that must be preserved (e.g., an update dominates a
previous update on the same file, the behavior of a clock), and clients can tolerate asynchronous changes.
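
The notion of a dominating update can be illustrated with the usual componentwise comparison of version vectors. This is a generic sketch, not the Locus implementation, whose details are not given here; the dictionary representation (site name to update count) and the function name compare are assumptions.

```python
def compare(a, b):
    """Compare two version vectors given as dicts mapping site -> update count.

    One vector dominates when it is at least as large in every component and
    the two are not equal. If neither dominates, the copies were updated
    independently in different partitions and are in conflict.
    """
    sites = set(a) | set(b)
    a_ge_b = all(a.get(s, 0) >= b.get(s, 0) for s in sites)
    b_ge_a = all(b.get(s, 0) >= a.get(s, 0) for s in sites)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a dominates"
    if b_ge_a:
        return "b dominates"
    return "conflict"
```

A "conflict" result corresponds to the case with no dominating update, which must be handled by content merging or reported to the owner.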

Resolution is not always possible, however. For example, in Locus there is not always a dominating update to
a file. For arbitrary files, Locus cannot resolve such inconsistencies in a principled way, and reports them to the file
owner. The inconsistent copies and their version vectors are made available so the owner can apply an arbitrary
resolution policy. Locus suspends normal access to the file until the owner installs the definitive, resolved copy.
Reporting moves much of the burden of dealing with inconsistency from the system to its clients. All possible
system states, both ill-formed and well-formed, must be defined, the behavior of its operations must be defined on
ill-formed states, and clients must have tools that can move ill-formed states closer to well-formed ones.

HPC client applications choose their own responses to partition and failure. Applications that aggressively
adapt to such events can radically restructure themselves to restore lost resources and recover the desired degree of
redundancy, but almost all applications will modify some aspect of process structure during a lengthy partition.
HPC uses all three basic strategies to deal with the various types of inconsistent process structure that can arise
during merge. The use of unique identifiers, globally known functions, and immutable properties avoids
inconsistency in many HPC features. Section 6.1 shows how HPC uses these techniques, which are not specific to
HPC and can be used to advantage in many distributed applications.

Applying avoidance techniques leaves a small number of structural features for reconciliation and reporting
(Section 6.2). HPC reconciliation is guided by a preservation principle: anything that works or may be even
passively of interest to an agent in any partition should be preserved in a merger involving that partition. The
mathematical operation of meet on a lattice can often be used to merge partitioned structures while preserving their
individual behaviors, and HPC uses several special cases of meet for reconciliation. However, some divergent
structures are simply incompatible and can not be merged naturally. HPC ensures that such conflicting structures do
not interfere with consistent parts of the process structure, reports the conflicts to the relevant agents, and gives
agents the tools needed to reduce an inconsistent state into a consistent one.

6.1. Avoidance

Change, sharing, and partition are the major ingredients in the recipe for inconsistency. Any feature of the
system that is immutable may be known uniformly throughout the system without the need for observation. In the
absence of partition, all observations may be made consistent by globally simulating a single site using well known
techniques, such as serialization. In the absence of sharing, all observations are trivially consistent because only one
observer may look at a given piece of the system. Partition can't always be avoided (it is, after all, a failure mode),
but the degree of change and sharing can be reduced as part of an avoidance strategy.

There are conflicts about privilege and authority that HPC can't merge while preserving the pre-merge
behavior and can't trust clients to reconcile for themselves. Because HPC can neither reconcile nor report these
problems, they must be avoided. By careful design, additional inconsistencies in HPC process structure can be
avoided, simplifying both the (static) HPC interface to clients and the run-time merge procedures, while retaining
complete availability of HPC operations during partition. These simplifications are sought for their own sake.

Despite the emphasis on dynamic change of structure, it was possible to design HPC so that most properties
and structural relations are immutable, completely avoiding the possibility of inconsistent updates. (Appendix A
gives HPC structure a formal description in terms of sets, relations, and predicates on those mathematical objects.
We will refer to some of them in this Chapter.) The role, type, and structure of views, the interfaces of an
abstraction, and the controller for a domain are prominent fixed properties. Less obvious examples are the pair of
views that comprise a shell, and the parent of a component view. (A view's known children may change, but not its
parent.)

Inconsistent changes can also be eliminated by leaving properties unconstrained, or explicitly defining sets of
states to be equivalent. HPC uses at least four variations of this technique. First of all, HPC is history-less. Legal
operations on a structure depend solely on its current state, and the sequence of operations that created it is both
unknown and irrelevant. Second, the hierarchical relation is the only dynamic ordering in HPC. All other dynamic
sets (e.g., multiplex or multicast views, shells adjacent to a space) are unordered, and all other orderings (e.g.,
bundle components, terminal policies) are immutable. Third, there are few upper bounds on numbers of
dynamically created structures. Partitions that individually satisfy an upper bound can easily violate the bound
when merged, while this cannot happen with lower bounds. Fourth, connections and spaces are unnamed, and
defined by the views they relate. Making them first-class entities would increase redundancy, and the opportunities
for inconsistency, without adding to the possible structures.
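
The remark about bounds can be checked with a two-line model. The sketch assumes, purely for illustration, that merging an unordered dynamic set means taking the union of what each partition holds; the function names are hypothetical.

```python
def merge(a, b):
    """Assumed merge rule for an unordered dynamic set: keep everything."""
    return a | b


def within_upper_bound(s, k):
    return len(s) <= k


def meets_lower_bound(s, k):
    return len(s) >= k


# Two partitions each create views independently while separated.
left = {"v1", "v2"}
right = {"v3", "v4"}
merged = merge(left, right)
# Each side satisfies an upper bound of 3, but the merged set need not.
# A lower bound satisfied by either side still holds after the union.
```

Because a union is never smaller than either operand, a lower bound that held in any partition survives the merge, whereas an upper bound can be exceeded even though every partition respected it.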

HPC uses three techniques to reduce sharing. Physical resources are literally tangible; they have real physical
locations, and can be examined and modified only if they are in the current partition. Instead of attempting to share
physical resources between partitions, HPC reports partitioning of processes and interprocess communication media
via suspended liveness of affected communication paths. Suspension avoids inconsistencies by making the status
of partitioned resources explicitly indeterminate, and therefore consistent with any state. The suspended terminal
policy allows agents to generalize this behavior to multi-process abstractions that do not have any specific physical
location. Third, globally unique names are created for each new piece of structure, so that similar operations in
different partitions will create different structure, rather than have conflicting effects on the same structure. These
unique names include the name of the generating host, so name creation does not need to be synchronized.
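
As an illustration of the last point, the following is a minimal sketch of host-qualified name generation. The class and field names (`UniqueName`, `NameGenerator`, `serial`) are assumptions made for this example and do not come from the HPC prototype; the sketch also assumes host names are themselves unique and ignores persistence of the counter across restarts.

```python
import itertools
import socket
from typing import NamedTuple, Optional


class UniqueName(NamedTuple):
    host: str    # name of the generating host
    serial: int  # counter local to that host


class NameGenerator:
    """Issues names that are globally unique as long as host names are unique."""

    def __init__(self, host: Optional[str] = None) -> None:
        self.host = host if host is not None else socket.gethostname()
        self._serials = itertools.count(1)

    def fresh(self) -> UniqueName:
        # Purely local: no communication with other hosts or partitions.
        return UniqueName(self.host, next(self._serials))
```

Two generators on different hosts can run while partitioned and still never issue the same name, because the host field differs.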

The visibility constraints imposed by the protection domain system also reduce sharing. First, disjoint
domains drastically limit the visible effects of an operation. Only the new operation and promiscuous splices have
direct effects in two domains; the liveness property also propagates across domain boundaries. Second, the HPC
system is free to hide information from its clients, especially non-hierarchical structure. If the exact relations
between domains were visible to clients, the facade of a strict hierarchy could not be maintained.

6.1.1.  Domain  Contents 

The contents of domains are formally described by the relation member(v, d), where structural element v is a
member of domain d. (Because shells have been equated to pairs of views, and both spaces and connections can be
formally defined in terms of views, we can treat views as the only kind of structural element.) It is a critical
property of the protection system that any view is a member of at most one domain. There is no way to express
membership in multiple domains, there is no way to designate an intelligent, neutral party to arbitrate conflicts
between agents of different domains, and the cooperation of agents cannot be assumed in matters as crucial as
protection, control, and authorization. For these reasons, membership in multiple domains must be avoided.

During a partition, one agent for a domain old that existed before the partition started could invest a new
domain new in a subtree containing view v. Suppose that v is moved from old to new. No matter what agents in other
partitions do, when the partitions are merged together, v will belong to new and at least one other domain. If they do
nothing to v, it will belong to old. If they create a different new domain through another invocation of invest, v will
belong to that new domain. (There is no way agents in different partitions can independently create the same new
domain.)

These violations of the constraint are avoided by making the domain of a view an immutable property, fixed
when the view is created. When domain creation and destruction transfer structure between domains, a distinct
copy is created in the new domain, isomorphic to the affected structure, then the original structure is deleted.
Another interpretation is that the affected structure is renamed, because the HPC system's internal representations
of abstract structure can be modified in place.

An object seldom will be an agent for both the old and the new domains. When only the old or the new is
visible, the exact isomorphism between them is not critical. However, there will be occasions when the
correspondence is useful or necessary. It can be easily preserved by structuring the name space. An HPC unique
identifier consists of a (unique) domain field and a (unique) element field. When a view is renamed, only the
domain field is altered. This also reduces the consumption of unique name space, as only one new name is required
to rename arbitrary subtrees.
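
A small sketch may make the structured identifier concrete. The type `ViewId` and the function `rename_subtree` are hypothetical names chosen for this example; the text specifies only that an identifier has a domain field and an element field and that renaming alters the domain field alone.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ViewId:
    domain: str   # unique domain field
    element: str  # unique element field


def rename_subtree(views, new_domain):
    """Map each old identifier to its renamed counterpart in new_domain.

    Only the domain field changes, so the old/new correspondence can be
    recovered by comparing element fields, and a single fresh name
    (new_domain) suffices for a subtree of any size.
    """
    return {v: replace(v, domain=new_domain) for v in views}
```

For instance, renaming `ViewId("old", "e1")` and `ViewId("old", "e2")` into domain `"new"` consumes one new name and leaves both element fields unchanged.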

6.1.2.  Characteristics  and  Immutable  Relations 

Renaming is a basic technique for treating a fixed set of things with dynamic properties as a dynamic set of
things with fixed properties. It avoids fatal inconsistencies (such as conflicts about privileges) by introducing less
dangerous ones. For example, a view may exist in some partitions but not in others. However, creating and
destroying views produces exactly the same effect, so renaming does not introduce a new problem. HPC's
preservation principle resolves presence/absence conflicts during merge by retaining views present in any merged
partition. Specifically, that means that a view that was deleted (or renamed) in one partition may be resurrected in a
later merge.

Creation, destruction, and renaming of views and related structure within the current partition are dynamic
operations, but HPC formal consistency is simplified by treating the set of views and the relationships among them
as immutable. The technique of characteristics and immutable relations used in this formal treatment is not specific
to HPC and can be applied to many data structures that are dynamically created and destroyed but otherwise have
fixed properties throughout their lifetimes.

The HPC process structure in the current partition is a subset of the structure in the entire environment. The local
structure is known exactly, but the non-local structure, by definition, can't be known, so we can assume anything
about the global structure that is consistent with the local structure as a subset. Specifically, we assume the global
structure is immutable, and define the structure within a partition as the intersection of the fixed global structure
with a dynamic local characteristic function (or set) that defines the views and other features known in the partition.
As views are created and destroyed, they are added to and removed from the local characteristic, but the global
structure remains unchanged.

When this definition can be applied, an especially simple merge procedure is possible. Partitions can be
merged by taking the set union of the local characteristics and the local structures. This preserves the defined
relationship among the resulting characteristic, local structure, and global structure. A proof of the consistency of
this merge procedure, and other advantages of the technique, are given in Section A.1.

HPC structural features that have fixed properties throughout their lifetimes can be defined this way. The
role, type, and structure of a view are such fixed properties, as is domain membership. Let the characteristic
c-view define the views known in the current partition, and let c-member(v, d) define the domains they belong to. As
each view is created, it is added to the local c-view and its permanent domain is added to the local c-member. When a
view is destroyed, it is removed from c-view and c-member. The global member(v, d) is a similar, immutable relation
defined for all views, all time, and all partitions. By defining c-member as the intersection of c-view and member, we are
assured that taking the union of several partitions' c-views and c-members is consistent with the definition. Writing
down member(v, d) would require full knowledge about the future, but we don't actually have to compute it. Instead,
it can be "virtual": the identifier or domain of a view is never needed until it is created or made available through
merge. Destroying a view simply removes it from the known local structure. If it has not been removed from all
partitions, it may become known again after a merge, consistent with HPC's preservation principle. Other fixed
view properties are treated the same way.
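
The relationship between the characteristic, the local relation, and the global relation can be sketched in a few lines. The function names `restrict` and `merge` are assumptions for this example. The global relation is written out explicitly here only so the property can be checked; in HPC it is virtual and never computed.

```python
def restrict(global_member, c_view):
    """Local relation: the global member relation restricted to known views."""
    return {(v, d) for (v, d) in global_member if v in c_view}


def merge(c_views):
    """Merge partitions by taking the set union of their characteristics."""
    return set().union(*c_views)


# Immutable global relation (view, domain); illustrative values only.
GLOBAL_MEMBER = {("v1", "old"), ("v2", "old"), ("v3", "new")}

partition_a = {"v1", "v2"}   # characteristic of partition A
partition_b = {"v2", "v3"}   # characteristic of partition B
merged = merge([partition_a, partition_b])
```

Restricting the global relation to the merged characteristic gives the same result as taking the union of the two local relations, which is the consistency property the text relies on. A view destroyed in one partition but still known in another reappears in the merged characteristic, as the preservation principle requires.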

View hierarchies are a more interesting example, because the components of a view may change dynamically.
The characteristic is again c-view, while the relation component(c, p) holds when c is a component view of its parent p.
Here, a technical constraint on the arguments of the global relations is very important. When we create a view we
certainly know its ancestors, but certainly not all its (as-yet uncreated) descendants. Therefore, we will insist that
the second argument of a local relation is characteristic whenever the first argument is characteristic.* For example,
the apparently equivalent parent(p, c) can not be used as a local relation, because we can not know all future
descendants of any currently known view.

Use of the characteristic technique interacts with other HPC design options, e.g., if deleting a view caused its
components to be inherited by its parent rather than destroyed, the global component relation could not be immutable.
The bound relation that defines direct private peer bindings (from shells and splices) is constrained similarly. When
one view of a shell is renamed, both views must be renamed to keep bound immutable and single-valued. That means
that opaque domain boundaries may "spontaneously" rename even when the domain on the visible side is
unchanged. This in turn implies that the domain field of a structured HPC identifier does not suffice when renaming
domain boundaries.

6.1.3.  Splices

As described previously, the splice operation is a cooperative, two-step operation. In a centralized
environment this offers no difficulty, but there are many opportunities for confusion in a partitionable environment.
The partial ordering of distributed events in separate partitions can make "first" and "second" steps undefinable.
Inconsistent steps can be taken in different partitions, which are exposed at a later merge. Different numbers of
consistent steps can be taken in different partitions, causing inconsistent merges. Because the other HPC operations
are one-step and effectively atomic within a single domain, the technique used to avoid inconsistencies with splice
merits a detailed examination.

* Relations with this property are sometimes called strict.

Both first and second splice steps take an empty shell and create a new domain boundary. Call the local and
remote shells in the first step old-L and old-R, respectively. To keep track of associated first and second steps, there is
an implicit relation between the old and new views. Let prespliced(old-L, old-R, new-R) relate the arguments of
splice(old-L, old-R) with the hidden view created to replace old-R. An invocation of splice will execute the first
step if there is no tuple with its arguments reversed in the prespliced relation, and complete the splice using the
hidden new-R otherwise. To be well-behaved, prespliced must identify a unique new view for a given pair of old
views, and identify no view if the old views have not been prespliced.

Without restrictions, partition will violate these constraints. A first step carried out in two partitions could
create both prespliced(old-L, old-R, X) and prespliced(old-L, old-R, Y). After a merge, will splice(old-R, old-L)
complete a splice using X, Y, or both? A first step carried out in only one partition could lead, after a merge, to
prespliced(old-L, old-R, X) at the same time that old-L is in the partition. Will splice(old-R, old-L) complete a splice
using X or take the first step in splicing to old-L? The likelihood that agents for both ends carried out their splice
while partitioned complicates this scenario, making the operations first steps in opposite directions, rather than a first
and a second step. Do two opposing first steps complete a splice?

Fiat removes some of the ambiguity. In HPC a prespliced shell takes precedence over an unspliced shell in a
second step, by definition. However, multiple first steps in one direction, and opposing first steps, remain a
problem.

A partitioned second step following a first step that establishes prespliced(old-L, old-R, X) is less awkward. We
can determine the identifiers for the complete splice boundary (new-L, new-R) during either the first or second steps, with
differing results. The first step can establish X as new-L, and fix the binding between new-L and new-R even though it is
not used until the second step, which would always replace old-R with new-R. Identically named structure is identical,
so this produces one splice, even when the second step is executed in multiple partitions. The alternative is to
determine them during the second step. The first step would replace old-L with X, which would be bound to some
hidden view, and the second step would replace X with new-L at the same time that old-R is replaced with new-R.
Executing the second step in multiple partitions would produce distinct splices because each partition would select
unique values for new-L and new-R. The effect is as if both steps had been executed while partitioned.

There are three general ways to avoid the remaining inconsistencies in the prespliced relation: enforce a
centralized decision, enforce a consistent distributed decision, and accept lower standards of consistency. Four
specific mechanisms were considered for the HPC design.

•  Synchronous  splice 

If the two steps in splicing were splice(old-L, old-R); splice(old-R, new-L), the operation would be synchronous,
delaying the second step until the results of the first step are available. The agent taking the second step explicitly
resolves any ambiguity. Besides introducing asymmetry to the agents negotiating the splice, this mechanism adds
another step to the negotiation, because the first agent must communicate new-L to the second.

•  Issue  a  ticket 

Another way to enforce a centralized decision would be to ask the HPC agent to precompute new-L and new-R and
issue a ticket that associates all four affected views. Splice would then take a ticket and the local view as arguments
rather than two views. This would require another basic operation for ticket creation, but has some attractions.
While the ticket buyer must know both views to be spliced, the ticket user need only know the view in its domain.
This adds a little modularity. The implicit relation of ticket 4-tuples is obviously an immutable (virtual) relation.

It is tempting to use tickets for splices that are not tied to specific target views. Tickets could be created
independently of the views to be spliced, and passed from agent to agent. However, like other capability schemes,
there would be an insoluble use-once problem in a partitionable, fully-available environment.

•  Accept multiple private bindings

The acceptable structures could be extended to allow a second step to splice a view to more than one other view at a
time. There is nothing fatal about allowing a view to be bound to more than one view, although we chose not to do
so. The effect of multiple private peers on communication patterns is well-defined: multicasting. Multiple binding
might also serve as a way to accept the effects of multiple ticket uses. However, these multiple bindings would be
completely hidden, unlike the multicasting introduced by complex views. This seems undesirable, especially if
some media may not be multicast.

•  Known  function  on  structured  names 

Distributed agents will reach consistent conclusions if they apply the same deterministic algorithm to the same data.
A pairing function globally and consistently can compute unique identifiers as a function of
(old-L, old-R). Appending the old names or interleaving their bits are typical pairing functions. However, a pairing
function requires some structure in the name space to avoid generating "fresh" identifiers that might be given to a
completely new structure. Also, a paired number has all the bits of its inputs, so the name space must be big enough
to contain the largest result. These conditions are met in HPC. A (pre)spliced shell can not be spliced again since it
is not an empty shell, so a pairing function will be applied only to fresh names. (When the splice is removed, a
completely fresh name is assigned to the empty shell.)

Despite the sparse use of name space, HPC uses this technique because it is fundamentally distributed and
preserves the desired symmetry and asynchrony of the splice operation. It also has the side-effect of creating one
splice, even when the affected views are spliced in multiple domains, or when both ends are spliced in separate
partitions. The ticket mechanism also has these nice effects if a single ticket is used, and avoids wasting any name
space.
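
The pairing-function approach can be sketched as follows. The tagged-tuple encoding of names and the function names `paired` and `splice_boundary` are assumptions made for this example; the text says only that paired names are computed by a known function such as appending the old names, and that the name space must be structured so paired names cannot collide with fresh ones.

```python
def fresh(host, serial):
    """A fresh name, tagged so it can never equal a paired name."""
    return ("fresh", host, serial)


def paired(own, peer):
    """Pairing by appending: the new name carries both old names."""
    return ("paired", own, peer)


def splice_boundary(old_local, old_remote):
    """Return (new_local, new_remote) for splice(old_local, old_remote).

    The result depends only on the two old names, so agents in different
    partitions, or at opposite ends of the splice, derive the same boundary.
    """
    return paired(old_local, old_remote), paired(old_remote, old_local)
```

Calling `splice_boundary(L, R)` at one end and `splice_boundary(R, L)` at the other yields the same two names with the roles swapped, so opposing first steps taken in separate partitions describe one splice rather than two.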


6.2.  Reconciliation and Reporting

Through careful design, most dynamic aspects of HPC process structure can be treated as partial local
knowledge about immutable global structure, using the technique of characteristics. These structural features can be
reconciled upon merge simply by taking the set union of the data in each partition. Set union is one specific
function in the general class of meets over lattices of values (vid., Chapter 6 of [Sto77]). The meet operation has
some highly desirable properties for reconciliation. It is stable, idempotent, and convergent; merging any number of
partitions any number of times in any order produces the same ultimate result. The preservation principle we
adopted to guide reconciliation can be obtained by careful choice of lattice. Max and min are other examples of
meets over appropriate lattices.

There are only three relations in the formal description of HPC structure that can not be handled by
characteristics. Merges of these features are handled by specialized meets for reconciliation, with explicit reporting
of cases where a structure-preserving meet could not be chosen. Policy(d, p) defines the temporary and
permanent terminal policies for a domain. Connections are defined by the symmetric relation connected(v1, v2), and
spaces are defined by the equivalence classes of the relation adjacent(v1, v2), which identifies pairs of views incident
on a common space.

6.2.1.  Terminal Policy

There can be only one terminal policy of each type for a domain. Within a partition, the most recently
specified policy is used, but different policies can be specified in different partitions. (Once again, partial ordering
of events in different partitions means that the "last" modification is not defined during a merge.) Inconsistencies in
policy cannot be reported to the client, because the policy specifies what to do when there is no client to report to. A
merger may continue a temporary loss of control or reveal a permanent loss of control, so a well defined policy
must be available immediately.

Reconciliation uses a non-trivial example of meet over a lattice. Because HPC understands what terminal
policies mean, it can resolve inconsistencies according to a sensible set of priorities. For example, suspend prevents
unauthorized interactions, while null does not, and die hides structure from the outside domain, while abdicate does
not. Our priorities in descending order are to conceal structure, exert control, and keep running. Accordingly, the
basic policies are partially ordered:

suspend/animate  <  null  <  die  <  abdicate,

where suspend is incomparable with animate, and animates with different parameters are incomparable with each
other. Policy sequences are partially ordered lexicographically, taking the default basic policy as the first element
(the reverse of the order in which basic policies are applied). For example:
die  <  die suspend
abdicate suspend  <  abdicate die
... animate(x)  is incomparable with  ... animate(y)
... animate(z)  <  ... null

The meet on this lattice is the greatest element less than or equal to its inputs. For example, the meet of:
die suspend animate(x)
die suspend animate(y)
die abdicate

is die suspend. The effect is to take the longest common prefix of incomparable sequences, and keep the best,
according to our priorities, of the sequences that remain. Since HPC applies one basic policy at a time until it gets
to one that works, the interpretation of incomparable policies is that the conflicting portions don't work.
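
The ordering and meet described above can be sketched directly. The encoding is an assumption made for this example: a basic policy is a `(name, parameter)` tuple, a policy sequence is a tuple of basic policies with the default policy first, and the helper names (`RANK`, `basic_lt`, `meet2`, `meet`) are hypothetical.

```python
from functools import reduce

# suspend and animate share the lowest rank; distinct policies of equal
# rank are incomparable (suspend vs. animate, animate(x) vs. animate(y)).
RANK = {"suspend": 0, "animate": 0, "null": 1, "die": 2, "abdicate": 3}


def basic_lt(a, b):
    """Strict partial order on basic policies."""
    return RANK[a[0]] < RANK[b[0]]


def meet2(s, t):
    """Greatest lower bound of two policy sequences under lexicographic order."""
    for i, (a, b) in enumerate(zip(s, t)):
        if a == b:
            continue
        if basic_lt(a, b):
            return s
        if basic_lt(b, a):
            return t
        return s[:i]  # incomparable here: keep the longest common prefix
    # One sequence is a prefix of the other; the prefix is the lesser.
    return s if len(s) <= len(t) else t


def meet(*sequences):
    """Meet of any number of policy sequences."""
    return reduce(meet2, sequences)


def seq(*policies):
    """Convenience constructor: seq("die", ("animate", "x"))."""
    return tuple(p if isinstance(p, tuple) else (p, None) for p in policies)
```

Applied to the three sequences in the example above, `meet` returns the sequence die suspend, and it does so regardless of the order in which the inputs are supplied.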

There are other reconciliation strategies worth consideration. The basic policies can be ordered differently,
according to different priorities, and lexicographic ordering is not the only way to compare sequences. For
example, none of the basic policies in a sequence up to the last (first applied) non-animate will ever be tried
(because a non-animate policy never fails), so this prefix could be discarded as irrelevant.

6.2.2.  Connections

The connected relation can not be modelled as the intersection of a characteristic with an immutable relation,
because connections are created and destroyed without creating and destroying the views they join. A view may
have at most one connection, and this constraint can obviously be violated upon merge. HPC cannot resolve the
inconsistency without making an arbitrary choice of connection(s) to remove, and this would violate the basic
principle of retaining all structure from all partitions, so the meet/lattice technique was not used.

Instead, the HPC system reports this inconsistency to the agents of the domain to which a multiply connected
view belongs. Undefined behavior is avoided by suspending the view, and all communication paths passing
through it, until a legal number of connections is restored. Agents can use the disconnect primitive to remove extra
connections as easily as legal ones, so the tools needed for user reconciliation already exist.

There are two implications for system design. First, the format of public peer notifications must allow
arbitrary numbers of peers even though at most one is normally permitted. Second, the connect and disconnect
primitives cannot be exact inverses, because the preconditions for disconnect must be more general than the
postconditions that connect can establish. This asymmetry could be removed by allowing connect to create more
than one connection to a view, automatically suspending it.
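
The merge-and-suspend treatment of connections can be sketched as follows. Representing a connection as a `frozenset` of its two views, and the function name `merge_connections`, are assumptions for this example; reporting to agents and suspension of communication paths are reduced here to returning the set of views that would be suspended.

```python
from collections import defaultdict


def merge_connections(*partitions):
    """Union the connection sets and find multiply connected views.

    Each partition is a set of connections; a connection is a frozenset of
    the two views it joins, which makes the relation symmetric by
    construction. No connection is discarded: views left with more than
    one peer are returned so they can be suspended and reported.
    """
    merged = set().union(*partitions)
    peers = defaultdict(set)
    for conn in merged:
        for view in conn:
            peers[view] |= conn - {view}
    suspended = {view for view, ps in peers.items() if len(ps) > 1}
    return merged, suspended
```

If one partition connected view a to b and another connected a to c, the merge keeps both connections and suspends a. After an agent disconnects either one, re-running the check on the remaining connections leaves nothing suspended.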

6.2.3.  Space  Hierarchy 

The proper nesting of objects is the most difficult structural invariant to restore during a merge, and the dual
graph representation is essential when expressing the possible inconsistencies and their reconciliations. Since
enclose and disclose modify the dual graph without creating and destroying all the views of the affected spaces, a
representation for spaces is a fundamentally mutable relation. Two formal relations define the dual graph:
bound(v1, v2), which describes the immutable edges (shells and splices), and a mutable relation, adjacent(v1, v2),
which indirectly describes spaces by their incident views. The pairs of views incident on a common space satisfy
adjacent, so the complete set of views incident on a space is an equivalence class of the relation. This indirect
description is technically more convenient than the obvious relation between views and explicit spaces

incident(v, s).

Merge can transform a set of strict trees into an arbitrary directed graph. This is the primary difficulty in
reconciling the object hierarchy. Figure 6.1 shows the merger of a partition with three nested shells, A, B, and C,
and a partition in which B has been disclosed. (The arrows shown on directed edges point away from the root.)


Figure 6.1.  Loop in the Dual Graph

Figure 6.2 shows two partitions in which enclose(B) has been executed on a common initial structure of nested
shells, A and B, and their merger.


Figure  6.2.  Parallel  Edges  in  the  Dual  Graph 


More complicated examples with larger loops and incomparable branches can easily be constructed.

The violations of the strict hierarchy can be removed by converting some of the directed, non-tree edges into
undirected splices. For example, the simple loop of Figure 6.1 can be converted as shown in Figure 6.3.


Figure  6.3.  Breaking  Loops  through  Eversion 

The hierarchical view of the pre-merge and post-conversion structures (Figure 6.4) shows that shell B is effectively
turned inside out, with one end of the opaque splice (B1) representing the exterior of B, and the other end (B2)
representing its interior. Because of this effect, we call the conversion technique eversion.


Figure 6.4.  Hierarchical View of Eversion

We can now describe the reconciliation of the hierarchy within a single domain under the assumption that the
domain has a unique root. The procedure is as follows:

(1)  Find (transparent, directed) shells that have been converted into (opaque, undirected) splices in some
partition, and evert them in the merger.

(2)  Take the transitive closure of the union of the local adjacent relations. In general this will merge spaces and
give spaces multiple parents.

(3)  Compute immediate dominators and back-edges in a single pass over the dual graph. Break all loops by
everting back-edge shells.

(4)  If a space has more than one incoming shell, evert them all, and merge the space with its immediate
dominator.

Spaces with incident edges v1 and v2 are merged by adding (v1, v2) and (v2, v1) to adjacent and taking the
transitive closure of the relation. This closure step may be performed once, after all spaces have been processed.
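
One conventional way to maintain the equivalence classes of adjacent under such merges is a union-find (disjoint-set) structure, sketched below. This is a swapped-in standard technique, not a description of the HPC prototype, and the class and method names are hypothetical. It maintains the closure incrementally rather than computing it once at the end, and it supports merging only; splitting a space, as disclose would require, is not covered.

```python
class Spaces:
    """Equivalence classes of the adjacent relation as a union-find forest."""

    def __init__(self):
        self.parent = {}

    def find(self, view):
        """Return the representative view of the space containing `view`."""
        self.parent.setdefault(view, view)
        while self.parent[view] != view:
            # Path halving keeps later lookups short.
            self.parent[view] = self.parent[self.parent[view]]
            view = self.parent[view]
        return view

    def adjoin(self, v1, v2):
        """Record adjacent(v1, v2); symmetry and transitivity are implicit."""
        self.parent[self.find(v1)] = self.find(v2)

    def same_space(self, v1, v2):
        return self.find(v1) == self.find(v2)
```

Adding the pairs from every partition's local adjacent relation, in any order, produces the same set of equivalence classes, which matches the order independence claimed for the merge.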

Any number of partitions can be merged in any order, and the eversions and space merges during a partition
merge can be performed in any order and still yield a unique result. This procedure hides all non-hierarchical
structure behind splices. Equally important, it preserves all compositions, and therefore all behavior. It does so by
modifying the hierarchy in ways that clients can not, so this automatic reconciliation is not transparent. Unlike
renaming, eversion can not be disguised as asynchronous behavior by another client agent.

The root domain, which consists of one space with incident edges for the top-level objects, requires slightly
different treatment. Because there is no root shell, and because the top-level objects in different partitions might be
disjoint, the root space may be represented by several equivalence classes of adjacent instead of one. The problem is
avoided simply by distinguishing the root space.

The first step given above, everting shells that have already been everted in a previous merge, could be
omitted and still result in a consistent object hierarchy. However, the reconciliation procedure would no longer give
the same result for merges of partitions in different orders. As with splicing, eversion offers a choice of creating
new splices every time any shell must be converted, or a single splice for a given shell, no matter how many times
the shell was converted in previous merges. We chose to create a single splice to reduce the amount of structure
created during merge, and use a pairing function to unambiguously relate a shell and its post-eversion splice.

Earlier reports described a mechanism for reporting and user reconciliation of inconsistencies in the object
hierarchy (detach) [LeFSS], [LeFSS], but the full implications on HPC system design were not understood at that
time. Both system and agent complexity are reduced by automatic reconciliation of the directed tree.

6.2.4.  Domain  Hierarchy 

Now consider domains with multiple superior domains. There may be no unique root within the domain to
base intra-domain reconciliation on, and operations like abdicate/depose and die/kill are no longer well-defined,
because there is no unique superior domain boundary or well-defined subtree to remove. Partition and merge can
easily produce such situations. Starting with two pre-partition domains above and below, during a partition an agent
for above can invest control of the subtree containing below to a new domain, middle. In one partition, below's superior
domain is above; in the other it is middle. More complex sequences of domain and space operations can lead to
structures as shown in Figure 6.5. (Domain boundaries are shown with double lines.)


Figure  6.5.  Multiple  Superior  Domains 


Each domain is prepared for intra-domain reconciliation by merging all the spaces with an incident superior
domain boundary before reconciling the domain internally. This ensures a unique root space within the domain,
makes all the superior domain boundaries incident to the same space, and brings each of the partitioned domain
roots up to the root of the merged domain. Figure 6.6 shows this step applied to the structure of Figure 6.5.


Figure  6.6.  Superiors  Isolated  at  the  Root 

The remaining inconsistency of multiple superiors to a domain is treated similarly to multiple connections to a
view. There is no function-preserving transformation that can be automatically applied, so the multiple superior
boundary views are reported to the agent of the inferior domain, and forced suspended. Because only the superior
domain boundaries are suspended, the domain agent process(es) are able to coordinate and execute a response (but
see below).

To allow clients to reconcile this kind of inconsistency, the definitions of abdicate/depose and die/kill are
extended to excess superior domain boundaries. Inside the domain, an excess domain boundary is simply removed.
Outside the domain, the domain is replaced by an empty leaf. When only one superior domain boundary remains, it
is unsuspended and the domain operations will have their usual effects. This is externally consistent with the
behavior of depose on a domain suspended because its agents are temporarily partitioned. An implication is that
kill on a given shell only destroys the structure it dominates. For DAGs rather than trees, this may be strictly less
than the structure it precedes in the hierarchy.
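
The extended treatment of excess superior boundaries can be sketched as follows. This is only an illustration
under stated assumptions: the Domain class, its superiors list, and the two method names are hypothetical and
are not part of the HPC interface.

```python
from dataclasses import dataclass, field

@dataclass
class Domain:
    """Hypothetical record of one domain and its superior boundaries."""
    name: str
    superiors: list = field(default_factory=list)  # names of superior domains
    suspended: bool = False

    def note_merge(self):
        # After a merge, more than one superior boundary forces suspension.
        self.suspended = len(self.superiors) > 1

    def remove_excess_superior(self, superior):
        # Inside the domain, the excess boundary is simply removed.
        # (Outside, the former superior would now see an empty leaf.)
        self.superiors.remove(superior)
        if len(self.superiors) == 1:
            # The sole remaining boundary is unsuspended.
            self.suspended = False

below = Domain("below", superiors=["above", "other"])
below.note_merge()
was_suspended = below.suspended
below.remove_excess_superior("other")
```

After the call, only the boundary to "above" remains and the domain operations regain their usual effects.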

Terminal policies of abdicate and die are extended to excess superior domain boundaries simply by applying
them to all superior boundaries rather than the usual unique boundary that is implicit in the policy. All boundaries
will be treated as excess and the domain will be effectively killed.

While this strategy leads to DAGs, rather than trees of domains, the graph of domains still behaves like a tree.
Domains can interact through communication or domain operations only when the subgraph relating them is a tree.

As  a  simplification  of  the  HPC  interface  to  processes,  simple  domains  are  treated  specially  to  ensure  a  single 
superior  domain.  Because  processes  are  physical  resources,  they  belong  to  a  single,  well-defined  partition  (unlike 
abstract  complex  objects).  If  a  merge  gives  a  process  multiple  superior  domains,  the  boundary  known  in  the 
partition  of  the  physical  process  is  retained,  and  the  boundaries  known  in  other  partitions  are  replaced  with  empty 
leaves.  During  the  partition,  these  other  boundaries  will  be  suspended,  representing  uncertainty  about  the  process’s 
continued activity, so this treatment, like renaming, is masked as asynchronous behavior of other agents.

Because structure is reported to clients strictly in the hierarchical view, the format for reporting a shell’s
parent must allow arbitrary numbers of parents even though only immediate children of the superior domain
boundaries will have multiple parents. Similarly to the case of multiple connections, inverse domain operations are
not exact inverses, but the asymmetry cannot be as easily removed in this case because there is no operation to
create a directed domain boundary between arbitrary domains.

There is a further, subtle interaction with protection structure and the system interface. It is possible, if
unlikely, for all paths to the domain's controller to pass through one or more of the superior domain boundaries,
leading to a temporary loss of control when the paths are suspended. In this situation, an external agent cannot
restore control directly; it can only give up control by destroying one of the surplus domain boundaries. When all
but one of the boundaries have been destroyed, control may be restored via paths through the remaining boundary.
Because the (possibly many) external agents have no way to coordinate their actions, restoration of control is
unlikely. Even "generous" agents willing to yield control can get into trouble because of the asynchrony in the
system interface. An agent could destroy the final boundary before learning the suspension had been lifted.

We declined the aggressive investigation of more general structures that this interaction suggests. We could
accept multiple superior domain boundaries as well-formed structure, but would prefer an explicit operation to
create such boundaries instead of making it an exclusive side-effect of uncontrollable partition and merge.
Alternatively, the hierarchical relation between domains could be done away with, so that all domain boundaries are
undirected splices, but the problem of destroying another domain, and any auxiliary domains it created, remains.
Authorization to destroy a domain, without knowing or controlling its internal structure, must be expressed
somehow, if not hierarchically.


7.  Prototype  Implementation

At  a  distance,  the  HPC  system  is  a  three  layer  cake.  At  the  bottom  is  a  collection  of  host  operating  systems,  at 
the  top  is  a  collection  of  client  processes,  and  in  between  is  HPC  software.  The  middle  layer  consists  of  three  types 
of  processes:  kernel,  host  IPC  router,  and  host  process  manager.  The  kernel  maintains  the  database  of  abstract 
structure,  and  determines  the  resources  needed  to  implement  an  abstract  operation. 

The  client-HPC  interface  provides  agent  processes  access  to  abstract  structure  like  shells  and  interfaces,  and 
worker  processes  the  host  IPC  resources  needed  for  end-to-end  communication.  Clients  interact  with  the  HPC 
kernel  using  a  complex  application  protocol  on  top  of  standard  network  connections  (TCP/IP).  (Section  7.1.) 

The  host-HPC  interface  deals  with  physical  resources  like  IPC  media  and  processes.  The  router  and  manager 
processes  isolate  host  dependencies  of  resource  creation,  destruction,  and  status  monitoring  from  the  rest  of  the 
software.  They  are  the  only  system  components  to  communicate  directly  with  hosts.  Sections  7.2  and  7.3  discuss 
their  internal  structures  and  interactions  with  the  kernel. 

The kernel process is the heart of the system. Section 7.4 describes the basic software packages that handle
interactions with the outside world and maintain the internal database of abstract structure. Incremental updating of
global connectivity as a result of local abstract operations is an important function of the database.

The final section presents some experiences with essential, desirable, or inadequate tools and programming
support.

7.1.  Client  Interface 

The client-kernel interface has three functions: integrate new clients into the HPC system, implement the
interactions between agent clients and the HPC kernel, and provide worker clients with the real IPC capabilities
represented by abstract endpoints. These functions are summarized here. For a full description of the C
language/UNIX operating system interface binding and the underlying network application protocol, see [Fri86].

7.1.1.  Registration 

Client processes are integrated into the HPC system by two TCP/IP connections. Clients connect to a well-known
kernel port to register with the kernel. The first step of the registration protocol creates the second
connection.

The client sends its host and process numbers to the kernel, which determines if the client has been animated.
If not, the kernel creates a new shell with no interfaces as an immediate child of the HPC root. This is how new
independent applications enter the system. After locating or creating the appropriate shell, the kernel sends its
interface descriptions to the client.

If the client is implementing a complex domain, it describes the interfaces it wants on the new shell to be
created, and specifies the interface to be connected to the new controller. The kernel acknowledges. For both
simple and complex domains, the client sends an explicit end to the registration protocol.
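
The kernel's side of the lookup-or-create step can be sketched as below. This is a toy model under stated
assumptions: the Kernel class, its dictionaries, the animated helper, and the "shell-N" identifiers are all
hypothetical, and the real exchange runs over the two TCP/IP connections rather than direct calls.

```python
import itertools

class Kernel:
    """Toy registration table; every name here is illustrative only."""

    def __init__(self):
        self.parent = {"root": None}     # shell id -> parent shell id
        self.interfaces = {"root": []}   # shell id -> interface descriptions
        self.shell_of = {}               # (host, pid) -> shell id
        self._ids = itertools.count(1)

    def animated(self, host, pid, shell_id, interfaces):
        # Record a process that HPC itself started inside an existing shell.
        self.parent.setdefault(shell_id, "root")
        self.interfaces[shell_id] = list(interfaces)
        self.shell_of[(host, pid)] = shell_id

    def register(self, host, pid):
        key = (host, pid)
        if key not in self.shell_of:
            # Unknown process: a new independent application enters the
            # system as an interface-less child of the root.
            shell_id = f"shell-{next(self._ids)}"
            self.parent[shell_id] = "root"
            self.interfaces[shell_id] = []
            self.shell_of[key] = shell_id
        shell_id = self.shell_of[key]
        return shell_id, self.interfaces[shell_id]

kernel = Kernel()
kernel.animated("hostA", 100, "worker", ["in", "out"])
known = kernel.register("hostA", 100)   # previously animated client
fresh = kernel.register("hostB", 7)     # new independent application
```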

7.1.2.  Agents

Interactions between agents and HPC use both client-kernel connections. One connection is used in a
synchronous, bidirectional protocol. For every invocation of an HPC primitive, the client sends a rigidly formatted
message. The HPC kernel responds immediately with a message containing a globally unique "request number"
that will be used later to refer to the invocation.

The kernel uses the second connection asynchronously and unidirectionally to send the client three types of
information messages: invocation error reports, notifications of structural change, and responses to the inquire
operation. Every message follows a rigid format, and begins with the request number for the invocation that
ultimately prompted it. For messages triggered by events outside the HPC system (e.g., failures) the request number
is a distinguished value.
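
The pairing of the two connections through request numbers can be sketched as follows. This is a minimal model,
not the actual protocol: the Invoker class collapses both ends into one object, and the value 0 for the
distinguished request number is an assumption made for illustration.

```python
import itertools

EXTERNAL_EVENT = 0  # assumed distinguished request number for outside events

class Invoker:
    """Toy model: request numbers tie asynchronous messages to invocations."""

    def __init__(self):
        self._next = itertools.count(1)
        self.messages = {}  # request number -> list of (kind, body)

    def invoke(self, primitive, *args):
        # Synchronous connection: the reply carries only a request number.
        request = next(self._next)
        self.messages[request] = []
        return request

    def deliver(self, request, kind, body):
        # Asynchronous connection: each message begins with a request number.
        self.messages.setdefault(request, []).append((kind, body))

inv = Invoker()
req = inv.invoke("enclose", "shell-3")
inv.deliver(req, "create", "shell-9")
inv.deliver(req, "change", "shell-1")
inv.deliver(EXTERNAL_EVENT, "change", "terminal failure")
```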

Every time a structural element is created, destroyed, or modified, a notification is sent to all agents for the
affected domain. This is the response to successful invocations of HPC primitives. One invocation may generate
many notifications carrying the same tag. These notifications are copious and brief to eliminate the need for polling
by agents.

•  An enclose operation generates a creation message for the new shell, creation messages for every one of its
interface views, change messages for its children, and a change message for its parent. A disclose operation
generates a deletion message for the shell, deletion messages for every one of its interface views, change
messages for its children, and a change message for its parent.

•  When a new interface component is created, a creation message for the new view and a change message for
its parent are generated. If the new view is complex (e.g., a bundle), creation messages are generated for all its
components.

•  When liveness or connectivity changes on a view, a change message is generated.

•  When a shell becomes a domain boundary, deletion messages for all the previously visible structure are
generated. When a domain boundary is dissolved, creation messages for all the previously invisible structure
are generated. In both cases, a change message for the shell is generated.
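
The notification sets for enclose and disclose listed above can be written out as a short sketch. The function
names and the (kind, element) tuple format are hypothetical; the real messages follow the rigid wire format
described earlier.

```python
def enclose_notifications(new_shell, views, children, parent):
    """Messages generated by an enclose, in the order listed in the text."""
    msgs = [("create", new_shell)]
    msgs += [("create", view) for view in views]
    msgs += [("change", child) for child in children]
    msgs.append(("change", parent))
    return msgs

def disclose_notifications(shell, views, children, parent):
    """Messages generated by a disclose: deletions replace creations."""
    msgs = [("delete", shell)]
    msgs += [("delete", view) for view in views]
    msgs += [("change", child) for child in children]
    msgs.append(("change", parent))
    return msgs

enclosed = enclose_notifications("s9", ["v1", "v2"], ["s3", "s4"], "s1")
disclosed = disclose_notifications("s9", ["v1", "v2"], ["s3", "s4"], "s1")
```

All messages from one invocation would carry the same request number, so an agent can group them without polling.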

7.1.3.  IPC  Terminals

Real communication between worker processes requires translation of abstract simple endpoints inside simple
domains into host-specific I/O facilities accessible from the worker processes. Worker processes then use host I/O
operations to communicate with one another.

An IPC terminal is a host I/O handle (e.g., UNIX file descriptor) together with any additional resources
needed for the HPC system to connect and disconnect workers. Before a client can use an endpoint, it must be
translated into a terminal. Clients must explicitly begin translation to avoid unnecessary consumption of per-host
and per-process resources. Clients can reclaim terminal resources by destroying the terminal.

Creation and destruction of a terminal has no effect on the abstract endpoint except that liveness reflects
connectivity to real terminals. Roughly, terminals are to endpoints as physical pages are to virtual pages, except that
clients must do their own resource management.
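
The endpoint/terminal split, with explicit translation and explicit reclamation, can be sketched as below. This
is an illustration only: the TerminalTable class is hypothetical, and a local socket pair stands in for whatever
host I/O handle and auxiliary resources a real terminal would need.

```python
import socket

class TerminalTable:
    """Toy map from abstract endpoints to host I/O handles."""

    def __init__(self):
        self._terminals = {}  # endpoint id -> (client socket, system socket)

    def translate(self, endpoint):
        # Translation is explicit: no host resources exist until requested.
        if endpoint not in self._terminals:
            self._terminals[endpoint] = socket.socketpair()
        return self._terminals[endpoint][0].fileno()

    def destroy(self, endpoint):
        # Reclaim host resources; the abstract endpoint itself is untouched.
        for sock in self._terminals.pop(endpoint):
            sock.close()

    def has_terminal(self, endpoint):
        return endpoint in self._terminals

table = TerminalTable()
before = table.has_terminal("e1")
fd = table.translate("e1")
same_fd = table.translate("e1")
table.destroy("e1")
after = table.has_terminal("e1")
```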

To associate the I/O handle with the endpoint and to create any other needed resources, the client and kernel
use a synchronous protocol over the first connection. In this implementation, terminal creation requires an
exchange of five messages. This is only part of the protocols needed for terminal manipulation. The kernel is engaging in a
similar, but more complex, protocol with the IPC router process at the same time.

The client interface library hides the terminal protocols, just as it hides the invocation protocol, and so
prevents client programming errors.

7.1.4.  Discrepancies 

The client-kernel interface was frozen early in the development of HPC, leading to some discrepancies
between design and implementation. The most glaring discrepancy involves control messages between agents and
controllers. As designed, an agent process wishing to invoke some operation sends a message to the controller of the
domain to be affected, using a control interface. The usual rules about structural compatibility and end-to-end
connections apply. A controller's multicast interface ensures that all agents connected to a controller receive its
notifications and that the controller receives messages from all agents.

As currently implemented, an agent process always invokes HPC operations, on any domain for which it has
privileges, through the procedures provided in the application interface library. The agent does not send messages
to connected controllers. However, it must have connections to the appropriate controllers. An agent's connections
determine its privileges, just as if control messages were being sent. If an end-to-end connection to the appropriate
controller exists, the kernel carries out the operation.
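
The rule that connectivity determines privilege can be sketched as a reachability test. This is a simplification
under stated assumptions: the connections mapping and the may_invoke function are hypothetical, and a plain graph
search stands in for the kernel's real end-to-end connectivity computation.

```python
def may_invoke(connections, agent, controller):
    """True if some chain of connections leads from agent to controller."""
    seen, stack = {agent}, [agent]
    while stack:
        node = stack.pop()
        if node == controller:
            return True
        for nxt in connections.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

# Hypothetical structure: agent -> interface view -> controller of domain D.
connections = {
    "agent-1": ["view-a"],
    "view-a": ["controller-D"],
    "agent-2": [],
}
allowed = may_invoke(connections, "agent-1", "controller-D")
denied = may_invoke(connections, "agent-2", "controller-D")
```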

The interface library sends messages directly to the HPC kernel using the invocation protocol, so control
messages cannot be intercepted, filtered or debugged as described in Chapter 3. Also, agents cannot separate
control activities for separate domains onto separate local interfaces. Sending invocations along connections
primarily associated with an agent, rather than a domain, also led to a distortion inside the HPC kernel that will be
discussed later.

In this centralized implementation, there will never be merge inconsistencies. Since no domain will have
shells to multiple parents, the arguments to abdicate and die are implicit.

The kernel administers both temporary and permanent policies, but the interface only allows a request for a
new permanent policy consisting of a single basic policy (die or abdicate). Clients cannot request alternatives to
the default temporary policy.

Endpoint/extension promotion is the final omission. Worker clients are allowed to translate either endpoints
or extensions of simple structure into IPC terminals, so promotion is implicitly allowed for simple domains.
However, there is no provision for promotion in complex domains. The details of managing a multi-agent
interaction that must take place immediately after a new domain is created, before any other operations are invoked,
and affect every view at most once, were never worked out.

7.2.  IPC  Router 

To translate abstract connectivity into real transport connections, the HPC system must provide clients access
to host IPC resources, and dynamically reconfigure the transport connections between clients. Unfortunately, most
systems and standard protocol suites lack third-party connect, a simple facility that would make reconfiguration
a trivial matter.

Lacking third-party connect, dynamic reconfiguration is complex and expensive enough to justify isolating
host-specific IPC functions in a separate IPC router process. The router supports creation and destruction of client
terminals, creation and destruction of end-to-end connections, and additional diagnostic functions. A TCP/IP
connection is used with an asynchronous, highly multiplexed, application protocol for command functions between
the kernel and the router.

7.2.1.  Third  Party  Connect 

Ideally, the HPC kernel could look up the IPC terminals for the endpoints of an end-to-end chain, and instruct
the host operating system to set up or tear down a transport connection between the terminals. The kernel would
control reconfiguration, connecting and disconnecting arbitrary pairs of processes (and terminals), without the
participation, or even cooperation, of the affected clients. This scenario preserves the distinction between
communication (workers’ responsibility), configuration (agents’ responsibility), and implementation (kernel’s
responsibility).

Real operating systems and protocol suites place unfortunate and, we argue, unnecessary constraints on the
creator of the transport connection and the connected processes, leading to intrusive, unsafe, or inefficient (or all
three) emulations of the desired third-party connect property.

For example, a UNIX pipe must be created before the processes it connects, by a process that creates the
connected processes. This does not allow reconfiguration of existing processes, and is totally inadequate to support
HPC.

Most operating systems with more sophisticated IPC objects (Charlotte links, Accent/Mach ports, 4.3BSD
pipes, etc.) permit passing a link end along an existing link. However, clients must actively participate in
reconfiguration, by continually monitoring a link to the HPC kernel for messages containing new links, discarding
old terminals for a given endpoint, and installing new links as terminals. There is no way to enforce, or even
inspect, that clients act correctly. They could retain old links (in systems where the kernel cannot retain destroy
rights), ignore new links, or send links to other clients, bypassing agents and the kernel entirely.

Standard network protocol suites such as IP provide even less support for transparent reconfiguration because
the kernel must give a client process full details about the address and identity of its peer before a connection can be
created.

Third-party connect is intrinsically inexpensive. For example, TCP/IP uses a well-defined sub-protocol for
connection set up and tear down, with clean interactions with the main protocol governing reliable delivery and flow
control. If a third process could initiate this sub-protocol across the network, instead of the two communicating
processes from the host interface, we would have third-party connect.

A principal reason that link systems and network protocols restrict configuration as they do seems to be
authentication of authorized configuring agents or, alternatively, equation of protection with holding a link or
connection (capabilities). However, even the absence of a trusted global authentication system does not stand in the
way of a practical third-party connect mechanism. To continue the TCP/IP discussion, each client could specify the
host and port from which authorized configuration messages are allowed. This provides third-party connect with
exactly the same degree of security as unmodified TCP/IP. We discuss this issue further in [Fri87].
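
The authorization rule suggested here, in which each client names the single host and port allowed to
reconfigure it, can be sketched as follows. This is a hypothetical model only: no such facility exists in
standard TCP/IP, and the Client class and its method are invented for illustration.

```python
class Client:
    """Toy endpoint that accepts reconfiguration from one declared source."""

    def __init__(self, config_host, config_port):
        # The (host, port) from which configuration messages are allowed.
        self.authorized = (config_host, config_port)
        self.peer = None

    def on_config_message(self, src_host, src_port, new_peer):
        # Reject configuration attempts from any other source address.
        if (src_host, src_port) != self.authorized:
            return False
        self.peer = new_peer  # connection is re-pointed by the third party
        return True

client = Client("kernel-host", 4000)
rejected = client.on_config_message("intruder", 4000, ("evil", 1))
accepted = client.on_config_message("kernel-host", 4000, ("worker-b", 5001))
```

The check relies only on the source address, which is the same trust basis an ordinary TCP/IP peer already uses.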

7.2.2.  Forwarding 

In the absence of third-party connect, and a notable (two and a half year) delay from our workstation vendor
in providing sufficient sources to build in the desired support, we chose to provide an unintrusive, secure, but
inefficient emulation for the UDP and TCP protocols. When a client IPC terminal is created, a fixed connection is
made to a similar terminal owned by the IPC router process. Reconfiguration is performed entirely inside the IPC
router. Clients are never requested to modify their terminals in any way, nor are they given any information about
their peers.

During the client-kernel protocol that translates an endpoint into a terminal, a similar router-kernel protocol is
executed to create a router terminal and connect it to the client. The primary difference is that the router-kernel
setup sub-protocol may be multiplexed among concurrent exchanges between the router and kernel.

After  terminal  creation,  the  router  keeps  track  of  which  host  terminals  correspond  to  which  endpoints.  When 
end-to-end  connections  are  created  and  destroyed,  the  kernel  instructs  the  router  to  start  and  stop  forwarding 
messages  between  the  endpoints.  Therefore,  connected  workers  do  not  send  data  directly  to  one  another,  but  first  to 
the  router  process,  then  on  to  the  destination. 

Besides the obvious inefficiencies of sending every message twice, the router is a bottleneck. Even in this
prototype, HPC system and client processes can be freely distributed. Messages that might be processed in true
concurrency are serialized in the router. Executing router functions in a separate process, and not inside the kernel
process, adds unwanted latency to handling both agent invocations and worker communications. Providing one
router for each physical processor (HPC does not assume a single router) might avoid the communication
bottleneck, but further problems arise: finding the router for an endpoint, potential router-to-router forwarding
inefficiencies, etc.

The IPC router setup does have some advantages, of course. Besides client simplicity and security, it offers
an obvious experimental implementation for multicasting media such as TCP, which don't have native multicast
semantics. For each terminal, the router keeps a list of all the other terminals to receive outgoing copies of
incoming messages. For a normal connection, this list has just one terminal. (This prototype does not attempt to
handle pathological interactions of limited buffering and reliable retransmission.)
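
The per-terminal destination list behind both ordinary forwarding and multicast can be sketched as below. The
ForwardingTable class and its method names are hypothetical; the real router acts on commands from the kernel
and moves data between host terminals rather than returning tuples.

```python
class ForwardingTable:
    """Toy router table: source terminal -> list of destination terminals."""

    def __init__(self):
        self.targets = {}

    def start(self, src, dst):
        # Kernel command: begin forwarding from src to dst.
        dests = self.targets.setdefault(src, [])
        if dst not in dests:
            dests.append(dst)

    def stop(self, src, dst):
        # Kernel command: stop forwarding from src to dst.
        dests = self.targets.get(src, [])
        if dst in dests:
            dests.remove(dst)

    def forward(self, src, message):
        # One outgoing copy per destination; a normal connection has one.
        return [(dst, message) for dst in self.targets.get(src, [])]

table = ForwardingTable()
table.start("t1", "t2")
unicast = table.forward("t1", b"hello")
table.start("t1", "t3")
multicast = table.forward("t1", b"hello")
table.stop("t1", "t2")
after_stop = table.forward("t1", b"hello")
```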

7.2.3.  Implementation

Because IPC is the fundamental issue in real interactions between objects, the router and the router-kernel
protocol were the first components of the HPC system to be implemented. They were fully, and easily, debugged
with extensive scaffolding processes standing in for clients and the kernel. The success of this approach prompted
similar unit-and-interface test procedures for the other components. Full details of the implementation are given in
[FrP86].

The router is a very highly multiplexed server, and care was taken that it would never block under any
circumstances. For each output terminal, it must first wait for available operating system buffer space, then wait for
input on the corresponding input terminal, read the input without blocking, write the output without blocking,
internally buffer any excess that could not be written, and deal with operating system errors or resource limitations
at any point. Of course, many terminals are bidirectional, so this is done in both directions, and multicasting further
complicates things.
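
The "write what fits, buffer the excess" step of this loop can be sketched as follows. This is an illustration
under stated assumptions: OutputTerminal and FakeHostBuffer are hypothetical, and the fake buffer stands in for a
non-blocking operating system write that may accept only part of the data.

```python
class OutputTerminal:
    """Toy non-blocking writer that keeps unwritten excess internally."""

    def __init__(self, write_some):
        self.write_some = write_some  # callable: bytes -> count accepted
        self.pending = bytearray()

    def send(self, data):
        self.pending += data
        self.flush()

    def flush(self):
        # Called again whenever the host reports free buffer space.
        while self.pending:
            accepted = self.write_some(bytes(self.pending))
            if accepted == 0:
                break  # would block: keep the rest and return immediately
            del self.pending[:accepted]

class FakeHostBuffer:
    """Stand-in for a host buffer with limited free space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = b""

    def write(self, data):
        n = min(len(data), self.capacity)
        self.data += data[:n]
        self.capacity -= n
        return n

host = FakeHostBuffer(capacity=4)
term = OutputTerminal(host.write)
term.send(b"abcdefgh")           # only four bytes fit
buffered = bytes(term.pending)
host.capacity += 4               # space becomes available later
term.flush()
```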

At the same time, the router is engaging in overlapped, non-trivial protocols with the kernel to handle
command functions. It must be ready to receive a new message from the kernel at any time, determine which
in-progress operation it applies to, and advance the operation to the corresponding step. Unique tags are used in the
router-kernel protocol, as in the client-kernel protocol, to simplify demultiplexing.

The router is based on a table driven state machine. Each terminal has an entry in the table to store the state
of its submachine. An incoming event on a terminal recovers the previous state from the terminal, then calls a
function to do the next step. This was an adequate technique, but we would never use it again. The lightweight
tasking package (developed after the router was finished) would eliminate explicit submachine management, and
give a much clearer picture of "what happens next" on a given connection.
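
A table driven submachine of the kind described might look like the sketch below. The state names, event names,
and step functions are all hypothetical and chosen only to show the dispatch pattern; they are not the router's
actual states.

```python
def begin_setup(terminal):
    return "awaiting_client"   # next state after a create command

def finish_setup(terminal):
    return "ready"             # next state once the client has connected

def tear_down(terminal):
    return "idle"              # next state after a destroy command

# (current state, event) -> function performing the next step
STEPS = {
    ("idle", "create"): begin_setup,
    ("awaiting_client", "client_connected"): finish_setup,
    ("ready", "destroy"): tear_down,
}

class Router:
    def __init__(self):
        self.state = {}  # terminal id -> saved submachine state

    def handle(self, terminal, event):
        current = self.state.get(terminal, "idle")  # recover previous state
        step = STEPS.get((current, event))
        if step is None:
            raise ValueError(f"unexpected {event!r} in state {current!r}")
        self.state[terminal] = step(terminal)       # do the next step
        return self.state[terminal]

router = Router()
trace = [router.handle("t1", e)
         for e in ("create", "client_connected", "destroy")]
```

The control flow for one connection is scattered across the table and the step functions, which is the lack of
clarity that a tasking package would remove.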

7.3.  Process  Manager

The process manager handles host-specific process creation and destruction for the HPC kernel. There is a
separate manager for each physical host. As with the IPC router, the manager-kernel interface consists of a TCP/IP
connection and an asynchronous command protocol, but the process manager is a much simpler piece of software.

The kernel passes the arguments from an animate to the process manager for the specified host. Inside the
manager, a lightweight task is created to fork and exec a process with the specified arguments, then return the host
process identifier to the kernel. The process manager also monitors the processes it has created and reports
termination to the kernel, which translates process death into implicit die operations.
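
The create-and-monitor role of the process manager can be sketched as below. This is not the original
implementation: Python's subprocess.Popen replaces the explicit fork and exec and the lightweight task, and the
ProcessManager class and report_death callback are hypothetical stand-ins for the manager-kernel protocol.

```python
import subprocess
import sys

class ProcessManager:
    """Toy per-host manager: start processes and report their deaths."""

    def __init__(self, report_death):
        self.report_death = report_death  # stands in for a kernel message
        self.children = {}                # host pid -> Popen object

    def animate(self, argv):
        child = subprocess.Popen(argv)    # fork and exec on UNIX hosts
        self.children[child.pid] = child
        return child.pid                  # host process identifier

    def reap(self):
        # Report every created process that has since terminated.
        for pid, child in list(self.children.items()):
            if child.poll() is not None:
                del self.children[pid]
                self.report_death(pid, child.returncode)

deaths = []
manager = ProcessManager(lambda pid, status: deaths.append((pid, status)))
pid = manager.animate([sys.executable, "-c", "pass"])
manager.children[pid].wait()   # demo only: wait so that reap sees the exit
manager.reap()
```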

By design, resources like files and devices are manipulated outside the HPC system. However, many
operating systems protect such resources by using access control lists and associating some user identity with each
process. The HPC process manager creates processes with a user identity without any special privileges, and the
UNIX set-user-id mechanism can be used to associate additional privileges with a specific executable program.
This interface to the host protection system is not strong or flexible enough to protect independent HPC applications
from one another, but a better solution is a matter for re-design rather than implementation.

7.4.  HPC  Kernel 

The kernel process is internally organized into several packages of software with sets of related lightweight
tasks. Altogether, there are nine distinct packages built on tens of supporting libraries not specific to the kernel.
The major packages can be roughly divided into those that handle interactions with the world outside the kernel
process, and those that handle strictly internal functions.

The client package handles four classes of client-kernel interactions: registration, terminal setup, invocation of
primitives, and asynchronous notifications. There is a permanent task to wait for new client connections, a transient
task to handle the body of the registration protocol, and a task to handle synchronous client interactions.

The router and process manager packages resemble each other, with one permanent task waiting to
demultiplex incoming messages, a small database of tasks waiting for messages with specific tags, and a permanent
task waiting for unsolicited terminal or process death notifications. An RPC-style module conceals the details of the
protocols from other kernel tasks. The router package also contains stub-like routines for terminal setup that run the
client-kernel and router-kernel subprotocols concurrently to minimize latency. (Such routines are obviously unable
to use the RPC-like blocking interface provided for other client or router functions.)

At the heart of the kernel is a structural database, with complex layered routines to access the database, and a
task for each pending invocation. Client tasks spawn these operation tasks in response to explicit agent invocations,
and the terminal and process death tasks spawn them in response to failures or uncooperative terminations.
This gives central responsibility to the client task, since most invocations are started there. This distorts the
kernel's natural structure, because all HPC primitives are fundamentally applied by domain, rather than by agent
client.

The final package is an internal controller service. There is a permanent task that waits for creation of new
complex domains, and creates controller tasks to handle each one separately. Controller tasks monitor domains for
permanent and temporary losses of control, and apply the appropriate policies. In a better implementation, each
controller task would have an IPC terminal for receipt of control messages, and the controller tasks would spawn
operation tasks instead of the client tasks.

Figure 7.1 shows the overall structure. Circles denote tasks, rectangles represent modules shared between
tasks, and triangles show queues on which internal tasks may block. Solid lines show regular calling patterns and
dashed lines show task spawning. Despite appearances, there are no circular dependencies between layers.

7.4.1.  Database  Operations 

Each operation task calls a single entry in the network layer of the structural database software, which is
responsible for translation between external protocol and internal data representations. A network function reads
and decodes the body of a protocol message into convenient internal data structures, verifies that all purported HPC
identifiers are legitimate, and passes on to the next layer. The network layer reports errors of all kinds back to the
invoking agent.

The high-level semantics layer completes translation of arguments, does argument validation, and requests
any necessary serialization. The function for a given HPC primitive first converts HPC unique identifiers into
pointers to the appropriate database structures, then collects any structures affected by the operation that are not
explicitly named in the argument list. The function then checks the domain of the request, the types and domains of
the arguments, and the structural relations among the arguments against the preconditions of the operation.

The high-level layer brackets calls to the low-level semantics layer with calls on the synchronization layer to
ensure conflicting operations do not overlap. An operation will be blocked until no possibly conflicting operations
are active. When resumed, it must check its argument list for changes caused by such conflicts, possibly collecting
a new set of implicit arguments. This synchronization is obviously needed in a decentralized or truly concurrent
kernel, but it is also needed even in this logically centralized, non-preemptive tasking design. The interfaces to host
IPC and process managers allow an operation task to block, yielding the kernel to another task while a remote
operation completes.
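The bracketing described above can be illustrated with a small sketch. It is not the kernel's synchronization layer: the class name, the method names, and the idea of describing an operation by the set of structure identifiers it touches are all assumptions made for illustration.

```python
class SyncLayer:
    """Serializes operations whose argument sets overlap (hypothetical API)."""

    def __init__(self):
        self.active = {}    # operation name -> set of structure ids in use
        self.waiting = []   # (name, ids) pairs blocked on a conflict, oldest first

    def _conflicts(self, ids):
        return any(ids & held for held in self.active.values())

    def begin(self, name, ids):
        """Return True if the operation may run now, False if it was queued."""
        ids = set(ids)
        if self._conflicts(ids):
            self.waiting.append((name, ids))
            return False
        self.active[name] = ids
        return True

    def end(self, name):
        """Finish an operation and return the names of the operations resumed."""
        del self.active[name]
        resumed, still_waiting = [], []
        for waiter, ids in self.waiting:
            if self._conflicts(ids):
                still_waiting.append((waiter, ids))
            else:
                self.active[waiter] = ids
                resumed.append(waiter)
        self.waiting = still_waiting
        return resumed


sync = SyncLayer()
first = sync.begin("connect", {"v1", "v2"})    # runs immediately
second = sync.begin("delete", {"v2", "v3"})    # overlaps on v2, so it is queued
third = sync.begin("new", {"v9"})              # unrelated, runs immediately
resumed = sync.end("connect")                  # releases v1 and v2
```

In this sketch a resumed operation is simply marked active. In the design described above it would also have to re-examine its argument list, because the operation it waited for may have changed the structures involved.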

After checking and serializing, low-level semantic routines are called to do the significant manipulation of
abstract structures. This level encodes HPC structural semantics, calling on a bottom-most database layer to record
abstract structure, and the IPC and process manager packages to realize abstract changes in physical resources. The
task handling the client registration protocol directly invokes some utility functions in this layer to create new shells
and controllers. The low-level semantic layer also issues all structural change notifications and responses to
inquiries.

The HPC database does not directly implement all the relations in the formal specification of HPC structure.
What is elegant, or at least simple, in a formal setting may be absurd in a computational setting. For example, when
incrementally modifying structure, it is much more efficient to infer adjacency from explicit spaces in the database
than to infer the spaces from an adjacency relation.

Figure 7.1. Kernel Software Structure

The most complex data structures and algorithms in the kernel are in the low-level semantic layer to handle
path maintenance. These will be discussed in detail later.

The final database layer is the collection of basic data structures and their access routines. It provides a name
registry for use by the network layer to translate external HPC identifiers into internal pointers. There are basic
internal data types for spaces, views, shells, domains, and processes, though only shells, views, and domains can be
named by clients.

Each type has operations for initializing its module, and for creating, validating, and freeing structures. Each
type representation carries a unique password value in its first location, and the database and the low-level semantic
layer routines rigorously check each argument for pointer and password validity. Besides these generic operations,
each type has specialized operations to set and clear fields, the most important being pointers linking two structures
to each other. These operations ensure that both links are made and broken at the same time, that links to structures
are never lost by accidentally overwriting them, and enforce a useful discipline in the low-level semantic routines.
Figure 7.2 shows the most significant links between the internal data structures.


Figure  7.2.  Principal  Database  Links 

The  database  layer  uses  generic  data  structures  like  hash  tables  and  ordered  sets  throughout  for  fast  lookup 
and  arbitrary  numbers  of  similar  links. 
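The password check and the paired link operations lend themselves to a short sketch. The type names, password values, and function names below are hypothetical, and a Python attribute stands in for the tag that the text places in the first location of each structure.

```python
class Record:
    PASSWORD = None                      # overridden by each concrete type

    def __init__(self):
        self.password = self.PASSWORD    # stands in for the tag in the first location
        self.partner = None              # the structure this one is linked to


class Shell(Record):
    PASSWORD = 0x5348                    # arbitrary value chosen for this sketch


class View(Record):
    PASSWORD = 0x5657                    # arbitrary value chosen for this sketch


def validate(record, kind):
    """Reject anything that is not a live structure of the expected type."""
    if not isinstance(record, kind) or record.password != kind.PASSWORD:
        raise ValueError("invalid or freed structure")
    return record


def free(record, kind):
    validate(record, kind)
    if record.partner is not None:
        raise ValueError("structure is still linked")
    record.password = 0                  # stale references now fail validation


def link(shell, view):
    """Make both halves of the link together, never overwriting an old link."""
    validate(shell, Shell)
    validate(view, View)
    if shell.partner is not None or view.partner is not None:
        raise ValueError("link would overwrite an existing link")
    shell.partner, view.partner = view, shell


def unlink(shell, view):
    """Break both halves of the link together."""
    validate(shell, Shell)
    validate(view, View)
    if shell.partner is not view or view.partner is not shell:
        raise ValueError("structures are not linked to each other")
    shell.partner = view.partner = None
```

Because every field change goes through `link` and `unlink`, a caller cannot leave a one-sided link behind, which is the discipline the paragraph above credits to the database layer.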

7.4.2. Root Domain and Controller Service

A special case in the second step of splicing allows promiscuous services. If the remote shell has been spliced
to a well-known identifier, then a sibling of it is created and spliced to the local shell. This single well-known
identifier amounts to a service service, because shells spliced to that identifier may then be spliced to from clients at
arbitrary places in the hierarchy without prior negotiation, once their identifiers have been distributed. It was more
convenient to provide services this way than to implement the restricted number of service shells described earlier.

The distinguished root domain is treated only slightly differently from client domains. The low-level
semantic layer delivers structural change notifications internally to a permanent "root agent" lightweight task instead
of externally to another process. The root agent task has two functions. It cleans up top-level domains that
abdicate by killing their subtrees, and it uses the service service mechanism to implement the internal controller
service.

During start-up, a shell is created inside the root domain and spliced to the service service. When processing
invest, the kernel splices the controller shell in the new complex domain to this controller service shell. The root
agent task creates a separate controller task to monitor each new domain.

The controller service is a fairly typical service, despite its implementation inside the kernel. A task watches
over the overall state of the service, clients come and go, tasks are created to service them, and (in a proper
implementation) client requests are translated into operations on an internal database. The only reason it couldn't be
reimplemented outside the kernel (cf. Section 3.4) is that liveness on the controller interface is insufficient to
distinguish permanent and temporary losses of control.

7.4.3. Path Maintenance

Both  the  client  interface  and  the  formal  specification  deal  directly  with  connections,  shells,  and  splices.  These 
define  local  connectivity  between  views.  Global,  end-to-end,  connectivity  between  objects  is  a  complex  function  of 
the  local  connectivity,  involving  indirect  bindings  introduced  by  abstraction  (corresponding  components)  and  by 
composition  (chains  of  alternating  public  and  private  peers). 

Deducing global connectivity from the local connectivity is certainly possible, and the formal specification is
written that way for simplicity. However, obtaining deductive closure directly from axioms of direct binding is
unsuitable for any real system. The efficient, incremental computation of global connectivity and liveness triggered
by operations on direct bindings is the most interesting algorithm in the kernel, called path maintenance. (The
similarity to truth maintenance systems in implementations of formal logic is deliberate.)

Path maintenance keeps track of direct and indirect bindings between views. The indirect bindings retain
enough global information to compute the effects of a change quickly, regardless of the structural distance between
the cause and the effect.

Each view data structure maintains separate lists of private and public peer bindings. Private and public
bindings are distinct, and a pair of views may be bound both ways simultaneously. Only direct public peers
(connections) are used in path maintenance. However, a view's list of private peers includes views at all odd
distances down chains. These cached bindings allow propagation of changes in connectivity and liveness directly to
distant affected views.

Multicasting allows more than one chain between two views, creating a given indirect binding in more than
one way. Path maintenance retains the proximate justifications for each binding to ensure they are removed at the
correct time. Direct bindings have a primitive justification as a connection or a shell/splice. Indirect bindings
between corresponding components are justified by the binding between their parents. Indirect bindings between
peers along a chain are justified by the alternating private and public bindings in the chain.

Bindings and justifications form a directed graph where the sources are the primitive justifications. At odd
distances from the sources, the entries are bindings; at even distances, justifications.

Figure 7.3. Binding-Justification Graph

Inferring the effects of a local change from the direct bindings requires the minimum amount of space, but an
unreasonable amount of time. The corresponding binding/justification graph would be trivial, with just primitive
justifications and direct bindings, and every view along a chain must be visited on every related operation.
Recording the complete set of direct bindings that ultimately justify an indirect binding represents the other extreme,
because each effect can be looked up in constant time using a binding/justification graph four layers deep: primitive
justifications, direct bindings, derived justifications, and indirect bindings.

However, direct lookup is expensive in space and has substantial hidden costs in time. The cost of creating or
destroying a single justification grows with the distance between the bound views, because links to a greater number
of justifying bindings must be maintained.

Path maintenance compromises between lookup and inference to improve performance. It deepens the
binding/justification graph by allowing indirect bindings to justify others, while reducing the fan-in and fan-out of
individual bindings and justifications. Because the graph is no longer bounded in depth, inferring an effect of
changing a direct binding is no longer a constant-time operation. But neither is it necessary to visit nodes that are
unaffected by the change.

Bindings between corresponding components are justified by the binding between their immediate parents.
Bindings along a chain are justified by just three bindings. For a binding between views v1 and v2, one direct
connection between, say, c1 and c2, and the private bindings (direct or indirect) between v1 and c1 and between v2
and c2 are recorded. This justification is not unique, because any connection between v1 and v2 could be chosen.
The order in which bindings are added to a chain determines the specific trio used to justify a binding.

This  compromise  has  several  notable  points. 

•  The apparently greater cost of traversing the graph (inferring indirect effects) is actually of the same order as
the cost of traversing a complete list of precomputed effects. Both traversals reach the same views and
bindings.

•  There is a large reduction in storage requirements for links between bindings and justifications. The same
number of bindings exist, but every justification now has a bounded number of constituent bindings, and
bindings for views closer to the tops of their local hierarchies participate in many fewer justifications. (In
exchange for this global reduction in links, cycles may introduce a larger number of justification data
structures.) The reduction in fan-in and fan-out translates into increased speed when manipulating
justifications.

•  The binding/justification graph is not a DAG. Cycles allow paths between peers with and without a complete
pass around the cycle. A binding that does not depend on a cycle can justify itself when a cycle is completed
by adding a connection. Figure 7.5 illustrates a cycle in a part of the binding/justification graph resulting from
the cyclic path shown in Figure 7.4. Longer path cycles can create longer cycles in the binding/justification
graph.


Figure  7.4.  Typical  Cyclic  Path 


Figure 7.5. A Resulting Self-Justifying Binding (labels recovered from the figure: BC public binding, AB private binding)

The kernel translates connect and disconnect into creation and destruction of public bindings. Similarly,
client shell and splice manipulations are translated into creation and destruction of private bindings, in addition to
manipulations on spaces and view hierarchies. These bindings are given a primitive justification. The new and
delete operations affect structure within a view hierarchy, and the bindings between corresponding components are
indirectly justified by their immediate parents.


Binding and justification removal follow a simple rule. A binding is removed when its last justification is
removed, but a justification is removed when any of its constituent bindings is removed. This leads to a recursive
algorithm for destroying connections or shell/splices. Addition of bindings and justifications naturally obeys the
converse rule, but is more complex because it must compute any additional bindings justified by a new binding,
while the destruction algorithm simply looks them up in the binding/justification graph. Pseudocode for these
algorithms is shown in Figures 7.6 and 7.7.
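The removal rule can be sketched in a few lines. This is an illustration written for this section, not the pseudocode of Figure 7.7: the class and function names are assumptions, and the sketch omits the hash table lookup, the liveness reporting, and the cycle handling discussed elsewhere in this section.

```python
class Binding:
    def __init__(self, name):
        self.name = name
        self.justifications = set()   # justifications that support this binding
        self.supports = set()         # justifications that use this binding


class Justification:
    def __init__(self, target, constituents=()):
        self.target = target                      # the binding being justified
        self.constituents = tuple(constituents)   # empty for a primitive justification
        target.justifications.add(self)
        for binding in self.constituents:
            binding.supports.add(self)


def remove_justification(justification, removed):
    target = justification.target
    if justification not in target.justifications:
        return                                    # already removed earlier in the recursion
    target.justifications.discard(justification)
    for binding in justification.constituents:
        binding.supports.discard(justification)
    if not target.justifications:                 # the last justification is gone
        remove_binding(target, removed)


def remove_binding(binding, removed):
    removed.append(binding.name)
    for justification in list(binding.supports):  # losing any constituent kills it
        remove_justification(justification, removed)


# A chain A-B (private), B-C (public), C-D (private) justifying the indirect binding A-D.
ab, bc, cd, ad = (Binding(name) for name in ("AB", "BC", "CD", "AD"))
primitives = {b.name: Justification(b) for b in (ab, bc, cd)}
Justification(ad, (ab, bc, cd))

removed = []
remove_justification(primitives["BC"], removed)   # the client disconnects B and C
```

Removing the primitive justification of the public binding removes that binding, which in turn removes the derived justification and the indirect binding it supported. The two private bindings keep their own primitive justifications and survive.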


In Section 4.2 we noted that multiple paths between views should not lead to multiple delivery of messages.
Any pair of views has just one (private and public) binding in the database, regardless of the number of
justifications, so the IPC router is easily instructed to provide single delivery. When a binding is justified, a hash
table is searched to determine quickly if it already exists.


A greater potential problem is the infinite number of chains between views provided by zero or more passes
around a cycle. An algorithm based on propagation (or inference) along a chain must include an explicit check for
cycling. Path maintenance avoids this by adding bindings based on information at a fixed number of nodes, and
adding justifications only when a new path is created. When a cycle is completed, a single justification accounts for
all numbers of passes through it.

Ordinarily, cycles in the binding/justification graph would never be removed, because the underlying strategy
is reference counting. However, all bindings are removed correctly without expensive checks for cycles. First, path
maintenance prohibits justifications in which a binding directly justifies itself. This prohibition requires checking a
binding against at most three others each time a new justification is found.

If direct private bindings could be removed at arbitrary times, this would be insufficient. However, direct
public bindings have only primitive justifications and semantic constraints at higher levels ensure a direct private
binding will never be destroyed while it is part of a cycle. Shells, and the direct private bindings between the tops of
their interface hierarchies, are destroyed only by disclose and splice. A precondition for disclose is that there are no
connections to any view on either side of the shell, and a precondition for splice is that there are no connections to
any view on the lower side of the shell. Induction can prove that the loops in the binding/justification graph are
removed before it is legal to destroy a binding that indirectly justifies itself.

Because a unique path between two views is justified only once, the non-unique three-binding path
representations are permissible. In direct lookup, mentioned above, each derived justification uniquely represents a
non-cyclic chain. In the three-binding compromise, each justification still represents a single chain, but a chain will
have multiple representations. These multiple representations account for the possible growth in the number of
justification data structures. However, different representations are used to justify distinct bindings, never to justify
a single binding redundantly, and never for bindings that don't complete a cycle. Moreover, it is never necessary to
search for a path, only for bindings.

The number of bindings grows as the square of the length of a chain, because path maintenance records
indirect bindings between every pair of private peers. This is a direct and expected consequence of maintaining an
explicit database instead of performing an exponential number of inferences. Creation of a single binding takes
constant time, but recursively creates N additional bindings, where N is the length of the chain just created. For
some applications, this linear cost would be unattractive, but HPC must visit all the views along the new chain to
report changes in liveness.
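A small counting exercise makes the quadratic growth concrete. It assumes a simplified model that is not taken from the implementation: a single chain whose direct bindings alternate private, public, private, and so on, with a private binding recorded for every pair of views joined by a sub-chain that begins and ends with a private binding.

```python
def private_bindings(n_direct):
    """List the (view, view) pairs bound privately along a chain of n_direct direct bindings.

    Views are numbered 0..n_direct, and direct binding k joins views k and k + 1.
    Even-numbered direct bindings are private, odd-numbered ones are public.
    """
    kinds = ["private" if k % 2 == 0 else "public" for k in range(n_direct)]
    pairs = []
    for first in range(n_direct):
        for last in range(first, n_direct):
            if kinds[first] == "private" and kinds[last] == "private":
                pairs.append((first, last + 1))
    return pairs


# Private bindings recorded for chains of 1, 3, 5, and 7 direct bindings.
counts = [len(private_bindings(n)) for n in (1, 3, 5, 7)]
```

Under this model a chain with k private direct bindings carries k(k + 1)/2 private bindings in total, so doubling the length of the chain roughly quadruples the count.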

The primary reason for bindings between all pairs of views along a chain is to allow destruction of bindings in
the middle of the chain without explicitly traversing it. It is simple to bind only the ends of a chain together, and
grow chains at the ends, with constant time creation and sub-linear (even negative) growth of maintenance data
structures. However, destroying a connection requires identifying the ends of all chains the connection justifies,
either directly or indirectly through corresponding components. An alternate path maintenance algorithm based on
direct bindings and maximal chains is worth investigation, but proper handling of cycles, reflectors, and multiple
paths may reduce its apparent advantages. Anyway, graph size has not been a problem in practice, and HPC
requires visitation of all nodes for other purposes.




binding_create (v1, v2, type, j)
view v1, v2;
int type;
justification j;

create b = new binding(v1, v2)

add j to b's justifications

if both views are terminals

Figure 7.6. Binding Addition

Figure 7.7. Binding Removal

7.5. Tools

While implementing the HPC prototype, we enjoyed, suffered through, or craved a variety of tools and
programming techniques. Time spent in building basic tools or "wasted" in disciplined program design and testing
will be more than repaid in reduced overall development time and reduced maintenance. Sometimes tools and tool
building make the difference between a success and a completely unmaintainable write-off. This is as true of
experimental software subject to frequent and rapid change, such as HPC, as it is of commercial codes.

HPC code quality is high. Most packages are very robust and easily modified. Some of the tools used to keep
them that way, along with some of the failures, are worth reporting.

7.5.1. State Tables vs. Tasks

As mentioned earlier, two substantially different methods of writing multiplexed programs were used: a table
of state machines and a collection of non-preemptive, lightweight tasks. A state table can be implemented using the
simplest of tools, and it is an intuitive approach. Those are its only merits, and a small investment in tools offers a
large reward.

The heart of the tasking package is a simple coroutine package, but HPC never uses the coroutines directly.
Above the coroutines are queues, a round-robin scheduler task, and routines for tasks to sleep on a queue, wake up a
queue, yield control to the scheduler, and terminate. The scheduler task runs a "backstop" task to collect external
events (like host I/O) when all other tasks are sleeping. These tasking functions are convenient and unintrusive to
use.
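The shape of such a tasking package can be sketched with Python generators standing in for the coroutines. Everything here is an assumption made for illustration: the class and method names are invented, a task yields `None` to give up control and yields a queue name to sleep on that queue, and the backstop is modeled as a plain callback rather than a task.

```python
from collections import deque


class Scheduler:
    def __init__(self, backstop=None):
        self.ready = deque()      # runnable tasks, served round-robin
        self.queues = {}          # queue name -> deque of sleeping tasks
        self.backstop = backstop  # called only when every task is asleep

    def spawn(self, task):
        self.ready.append(task)

    def wake(self, queue):
        """Make every task sleeping on the named queue runnable again."""
        self.ready.extend(self.queues.pop(queue, ()))

    def run(self):
        while self.ready or self.queues:
            if not self.ready:
                if self.backstop is None or not self.backstop(self):
                    break                         # nothing can wake the sleepers
                continue
            task = self.ready.popleft()
            try:
                request = next(task)              # run the task to its next yield
            except StopIteration:
                continue                          # the task terminated
            if request is None:
                self.ready.append(task)           # plain yield: back of the line
            else:
                self.queues.setdefault(request, deque()).append(task)


log = []
events = ["io"]                                   # pending external events


def worker(name, steps):
    for step in range(steps):
        log.append((name, step))
        yield                                     # yield control to the scheduler


def waiter():
    log.append("waiting")
    yield "io"                                    # sleep on the "io" queue
    log.append("woken")


def backstop(scheduler):
    if not events:
        return False
    scheduler.wake(events.pop())
    return True


scheduler = Scheduler(backstop)
scheduler.spawn(worker("a", 2))
scheduler.spawn(waiter())
scheduler.spawn(worker("b", 1))
scheduler.run()
```

Because a task gives up control only at a `yield`, every stretch of code between yields behaves as a critical section, which is the property of non-preemptive tasks that this section relies on.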

The tasking package has two big advantages over state tables. First, the backstop task distributes external events
without knowing what to do with them, and regular tasks just wait for the events they want without dealing with the
distribution. This separation makes both distribution and processing of events cleaner, easier to read, and easier to
extend. This is more important than it might seem at first. The HPC kernel redistributes many external events two
or even three times. For example, the backstop task waits on the host for external events, the process manager input
task waits on the backstop task for messages from the manager, and the process death task waits on the manager
input task for death messages. Tasks cleanly separate the responsibilities and concerns of the three levels.

Second, tasks encode significant flow of control in one place using a conventional programming language. In
the state table approach, flow of control is encoded in the data, and distributed over many different functions. It is
hard to distinguish the logically separate computations, and hard to manage interactions between them in a state
table, but they are easily managed using queues for synchronization with associated data structures for communication.
Development, maintenance, and debugging are all much more difficult for state tables than tasks.

Lack of compiler support is the only notable disadvantage to lightweight tasks. Non-preemption has been an
advantage, making every block of code an implicit critical section and eliminating the need for nuisance locks on
data structures, rather than a disadvantage. The underlying coroutine package requires architecture- and
compiler-dependent assembly coding; however, the original package was easily ported to diverse architectures such
as VAX, MC68000, and ROMP (IBM RT-PC).

The real problem is stack management. A task's maximum stack depth is fixed when the task is created.
Stack overflow was a continuing problem during development. Overflow cannot be automatically detected without
compiler support, so the tasking package was extended with a mechanism to measure the deepest point reached on a
stack. The main function for each kind of task is coded to measure stack depth explicitly after each iteration
through its body and, unfortunately, after any damage has been done. Packages are recompiled to increase the stack
size associated with tasks that come too close to their limits. Many tasks have bounded stack usage, but operation
tasks call recursive routines, so any limit may be insufficient.

7.5.2. Message Library

The synchronous, master-slave paradigm offered by RPC is inadequate for implementing the generally
asynchronous, highly multiplexed, peer relationships between the component processes of the HPC system.
Therefore we built RPC stubs that block tasks instead of processes and allow the flexible distribution of incoming
messages to existing tasks needed inside the HPC kernel. Compiler and stub generator support would have been
welcome, but not with the additional baggage carried by available RPC implementations. In particular, network
transport, message encoding and decoding, and flow of control had to be managed separately in the HPC kernel,
while these are all unified in RPC.

The ideal transport medium was reliable (perhaps unordered) delivery of messages with distinct boundaries.
Unreliable delivery of messages (UDP) and reliable delivery of byte streams without internal boundaries (TCP)
were available when implementation started. We chose to build rigidly formatted messages on top of TCP, pray for
detection of malformed or unsynchronized messages, and resynchronize after errors by dropping the TCP
connection. This was considered a better investment than implementing our own reliable transmission protocol.
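The general technique of imposing message boundaries on a byte stream can be sketched as follows. The header layout, the marker value, and the size limit are assumptions chosen for this illustration and are not the HPC message format.

```python
import struct

MAGIC = 0x4850                    # hypothetical marker that begins every message
HEADER = struct.Struct("!HH")     # marker, then body length, in network byte order
MAX_BODY = 1024                   # hypothetical upper bound on a message body


class FramingError(Exception):
    """The stream is malformed or out of sync; the caller should drop the connection."""


def frame(body):
    if len(body) > MAX_BODY:
        raise ValueError("message body too long")
    return HEADER.pack(MAGIC, len(body)) + body


def extract(buffer):
    """Remove and return every complete message body at the front of a bytearray."""
    messages = []
    while len(buffer) >= HEADER.size:
        magic, length = HEADER.unpack_from(buffer)
        if magic != MAGIC or length > MAX_BODY:
            raise FramingError("bad header")
        end = HEADER.size + length
        if len(buffer) < end:
            break                                 # wait for the rest of the body
        messages.append(bytes(buffer[HEADER.size:end]))
        del buffer[:end]
    return messages
```

A receiver appends whatever the stream delivers to the buffer and calls `extract` after each read. There is no attempt to hunt for the next valid header after an error; as in the text, the only recovery is to drop the connection and start again.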

There is no question that the HPC implementation would have died a miserable and lingering death if
application messages were assembled and encoded "manually". To centralize byte packing and conversions
between internal and external representations, a message library was written that would take a message buffer and a
human-readable format string and scatter or gather the arguments specified by the format. Format strings use
abstract data types relevant to HPC (shell) rather than the underlying concrete types (long).
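A format-driven gather and scatter interface might look like the sketch below. The format syntax, the table of abstract type names, and the concrete sizes are all hypothetical. Note also that this sketch packs fields with no padding, which is not the aligned external representation the HPC library actually used.

```python
import struct

# Hypothetical mapping from abstract HPC type names to struct codes.
CONCRETE = {
    "shell": "I",     # 4-byte unsigned identifier
    "view": "I",
    "domain": "I",
    "count": "H",     # 2-byte unsigned quantity
    "flag": "B",      # 1-byte quantity
}


def _layout(fmt):
    """Translate a format such as "shell view count" into a packed, big-endian layout."""
    return struct.Struct("!" + "".join(CONCRETE[name] for name in fmt.split()))


def gather(fmt, *values):
    """Encode the values named by the format into a message body."""
    return _layout(fmt).pack(*values)


def scatter(fmt, body):
    """Decode a message body into the values named by the format."""
    return _layout(fmt).unpack(body)


body = gather("shell view count", 7, 9, 2)
fields = scatter("shell view count", body)
```

The format string doubles as documentation of the message, and changing the concrete size of an abstract type means editing one table entry rather than every call site.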

The self-documenting format feature was a great success, explicit management of message buffers was
tolerable when packaged correctly, but the external data representation was a serious mistake. The external
representation used one, two, and four byte quantities aligned on quantity boundaries from the beginning of the
message, under the impression that this would avoid problems caused by host data alignment requirements. What
actually happened was that entire messages had to be aligned to be read properly, and that alignment had to be
maintained even after the first part of a message had been discarded.
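The arithmetic behind this problem is easy to reproduce. The helper below is a hypothetical illustration of aligning each quantity on a multiple of its own size, counted from the beginning of the message. It is not code from the message library.

```python
def aligned_offsets(sizes, start=0):
    """Return the offset of each field when every field is aligned on its own size.

    Offsets are counted from the beginning of the message; start is the offset
    at which the first listed field may begin.
    """
    offsets, position = [], start
    for size in sizes:
        position += -position % size      # pad up to the next multiple of size
        offsets.append(position)
        position += size
    return offsets


whole = aligned_offsets([1, 4, 2, 4])         # a one-byte field, then 4, 2, and 4 bytes
rest_rebased = aligned_offsets([4, 2, 4])     # the remainder treated as a fresh message
rest_in_place = aligned_offsets([4, 2, 4], start=1)
```

In the whole message the last three fields sit at offsets 4, 8, and 12. A reader that discards the first byte and treats the remainder as a fresh message expects them at 0, 4, and 8, although relative to the new start they actually sit at 3, 7, and 11. Only a reader that remembers the original start of the message computes the right offsets.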

Eventually a solution to this alignment problem was neatly, even elegantly, packaged up, but fixed size data
quantities would have been a better solution. A still better solution would have been to use an existing de facto
standard encoding like XDR or Courier, preserving the message library to translate between HPC abstract types and
the concrete types directly supported by the encoding.

7.5.3. Protocol Grammar

The message library deals with individual messages, but does not simplify programming exchanges of
multiple messages or of demultiplexing interleaved exchanges. These are problems both for the kernel, and for any
realistically complex agent, which must manage several concurrent but independent strategies for different portions
of its domain. The client interface provided by HPC must be augmented with a wide range of programming support
tools before the overall system is practical.

The user interface dialogue grammars under development by Yap [ScY88] look like attractive tools for
managing many of these protocol problems when building an agent. We have not yet applied them in the HPC
system.

7.5.4.  Test  Scaffolding 

No exact records were kept, but it is probable that the disposable test scaffolding used for HPC development
is larger than the HPC code itself. At least one, and usually two, test processes were created to take the place of
clients, kernels, and routers. Each application protocol between processes with its corresponding interface libraries
was tested independently of the application code, and IPC terminal setup was tested with nearly every combination
of dummy clients, routers, and kernels during various stages of development. The earlier test processes were run
interactively to step through each protocol and set up stress cases. Later scaffolding generally ran automatic test
sequences to verify that further development had not introduced new bugs.

As a result of protocol testing, the kernel packages that handle interactions with external processes were
debugged independently of the central database operations. The layers of database routines lent themselves to easy
testing from the structural database up. As functions were added to each layer, corresponding test functions were
added to a test suite and run after every significant change. The time spent running the test suite paid for itself in
development time through early detection and good isolation of errors.

In many cases, fully exercising a layer in isolation violates global structural constraints. As mentioned earlier,
the path maintenance code may not remove direct private bindings at arbitrary times. Isolated test exercises had to
be gradually removed from the test suite as the dependencies between layers and checks for additional HPC
constraints were added. The final kernel test suite contains over 650 calls on the low-level semantic and database
layers, and over 950 checks on the results in the structural database. This is larger than any package except the
low-level semantic layer, and none of it is used in the kernel.

7.5.5. Log Files

Log files are an invaluable, if unexciting, diagnostic tool. HPC keeps logs with adjustable levels of reporting
for the kernel, IPC router and process manager. The globally unique number generation system also uses a per-host
special log to prevent reuse of numbers.

A standard log message heading with the date, time, and logging process identifier was especially helpful in
sorting out the variety of messages in the kernel and router logs. At various times during development, the logs have
recorded debugging traces, task stack depths, rough timing estimates, client arrivals and departures, internal task
creations, dumps of valid incoming messages, dumps of protocol violations, resource consumption reports, hash
table statistics, rates of unique number generation, host fatal errors, and violated internal assertions. Mundane, but
essential, material.
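A heading of this kind, combined with an adjustable reporting level, takes only a few lines. The function name, the argument names, and the exact layout of the heading are assumptions for illustration; the original logs may have been formatted differently.

```python
import os
import time


def log_line(message, level, threshold, now=None, pid=None):
    """Return a log line with a standard heading, or None if the level is filtered out.

    level     -- detail level of this message (higher means more detailed)
    threshold -- highest level currently being reported
    now, pid  -- overridable for testing; default to the current time and process
    """
    if level > threshold:
        return None
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    pid = os.getpid() if pid is None else pid
    return f"{stamp} [{pid}] {message}"


shown = log_line("client arrived", level=1, threshold=2, now=0, pid=42)
hidden = log_line("hash table statistics", level=3, threshold=2, now=0, pid=42)
```

With the process identifier in every heading, lines from several processes can be merged into one file and still be sorted out afterwards.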

7.5.6. Graphic Interface

Complicated dynamic activities can be very difficult to understand. Used wisely, graphic displays and user
interfaces can make an impossible task practical, especially for a system like HPC with a natural graphic
representation. An interactive graphic interface for domain agents would be a tremendous experimental tool.
Unfortunately, good graphic interfaces are a major investment in development time and resources, and client
support tools were not an important issue in this thesis.

Still, a simple, non-interactive graphic display was integrated into the tasking package to display the internal
status of the kernel conveniently. Each task has a separate marker with descriptive labels. The display places the
markers for different types of tasks (client, operation, controller, etc.) in different columns. This has been a
valuable tool. The degree of multiplexing and the number of clients is manifest. Premature task termination,
failures to terminate, and unreclaimed resources are instantly and obviously visible on the display. Log files provide
the same raw data in a format that is much harder to use.

7.3.7. Interface Preprocessor

Interface  structure  descriptions  (medium,  orientation,  component  structures,  etc.)  are  complex  pieces  of 
information that must be manipulated efficiently by software, transmitted via network connections, and
communicated  to  and  from  human  beings.  There  are  three  corresponding  representations:  linked  graph  data 
structures,  linear  encoded  byte  sequences,  and  ASCII  strings. 

For many purposes, a structure description is converted from one representation to another at runtime.
However, this is inconvenient for programming clients. A programmer would like to write an ASCII description
inline in a program, and have it converted to a useful form during compilation. A simple source code preprocessor
takes inline HPC descriptions and converts them into C language arrays initialized to the linear encoding for the
description. This form can be passed directly to the client-kernel interface library, which is the most common use of
inline descriptions.

7.3.8. Code Assertions

The low-level semantics and structural database packages must maintain many invariant properties to
preserve HPC properties and avoid corruption of data structures. Assertions about these invariants are a significant
fraction of the executable code in those layers. Seven percent of the low-level semantics layer, and 12 percent of
the database layer, is devoted to checks that are usually unnecessary and often redundant.

However, the cost of invariant checking is insignificant compared to the development and maintenance time it
saves, not to mention the increased confidence it inspires in complicated database manipulations. On dozens of
occasions, an apparently innocuous code addition or change violated an assertion. Usually, the violations were the
result of incorrect coding that was acceptable in the context of the module being added, but incorrect in the larger
context of the entire database. Asserted invariants were almost never too restrictive; the exceptions were due to
either a lack of forethought in specifying the constraint or a radical change in implementation requiring global
changes throughout the database.
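
The style of checking described above can be illustrated with a small sketch. It assumes a toy structural database that maps each node to its parent and asserts strict nesting after every change; the names `check_nesting` and `reparent` are hypothetical and are not taken from the HPC sources.

```python
def check_nesting(parent):
    """Assert that `parent` (child -> parent, root -> None) forms a single tree."""
    roots = [n for n, p in parent.items() if p is None]
    assert len(roots) == 1, "exactly one root expected"
    for node in parent:
        seen = set()
        while node is not None:
            assert node not in seen, f"cycle through {node}"
            assert node in parent, f"dangling parent {node}"
            seen.add(node)
            node = parent[node]

def reparent(parent, child, new_parent):
    """Move `child` under `new_parent`, then re-check the whole database."""
    parent[child] = new_parent
    check_nesting(parent)  # usually redundant, occasionally invaluable

db = {"root": None, "a": "root", "b": "a"}
reparent(db, "b", "root")  # a legal change: the invariant still holds
```

A locally plausible call such as `reparent(db, "a", "a")` raises an `AssertionError` at once, which is the kind of early failure the paragraph above credits with saving development time.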
8. Conclusions

We began with three general goals: develop a structural representation for target applications, provide
operations to manipulate the representation during execution, and identify specific influences of the distributed
environment on application structure and management. As in any research, the successful pursuit of initial goals
leads to unexpected conclusions and suggests goals of future research. Here we present some general conclusions
on system design. We also suggest several research areas ripe for additional work.

8.1.  General  Observations 

During this research, we came to some conclusions on system design that apply widely.

8.1.1.  Dynamic  Structure 

•  Use  semantics,  not  syntax,  to  describe  dynamic  structure. 

The typical language-based approach to distributed programming handles static process structures well, while
handling open systems and run-time reconfiguration poorly, if at all. Dynamically changing structure should be
represented as an abstract data structure with a set of manipulating operations, not as a syntactic form in a program.

The HPC design successfully demonstrates the data structure approach, encoding a broader set of process
structures than any programming language we know. HPC also tackles dynamic changes and merge
inconsistencies, which cannot even be expressed syntactically. Future distributed programming languages should
not attempt to encode process structure syntactically, unless the process structure is to be entirely fixed.

8.1.2.  Process  Structure 

• Passive hierarchies are an appropriate model for overall application structure.

Many distributed applications can be naturally described as a nested hierarchy of abstractions. We find passive
hierarchies superior to active ones on the basis of clean abstraction and fault tolerance through redundancy.
However, passive hierarchies do not encompass all useful application structures. Most notably:

•  A  comprehensive  model  of  process  structure  requires  non-hierarchical  features. 

A strict hierarchy with explicit composition is too cluttered, too concrete, and too brittle to support complex
applications in an open environment. While programmers and managers may be presented with the appearance of a
strict hierarchy, practical systems require controlled violations of this paradigm to provide transparency. Also, trees
are intrinsically incapable of expressing the merge inconsistencies that they can generate. It is far better to use
arbitrary graphs as a basic structural model, and impose a hierarchy as a surface feature, than to use trees as a
fundamental tool. This conclusion was not expected, but our hierarchically motivated design was not complete
and consistent until we adopted graphs as the underlying model.

8.1.3. Management Structure

• The relation A manages B is as important as A communicates with B.

Powerful tools are needed to describe and dynamically manipulate this relationship between agents and domains. In
the context of HPC, this observation suggested the reuse of existing compositional tools. By adding abstract
placeholders for each domain, the complete protection relation can be described in the same explicit detail as the
communication relation.

Before reusing composition, we investigated a number of inheritance and default rules for propagating
privilege from an abstract, multiprocess object to (some of) its real processes at the leaves. All rules led to conflicts
with the basic principles of abstraction and composition. In contrast, the division of the hierarchy into domains that
follow object boundaries was an obvious, and satisfactory, design decision.

This observation applies widely. A great many tools, ranging from network protocols, through operating
systems and programming languages, give close control over binding two or more communicating entities together.
In contrast, the tools for specifying protection and management relationships are crude and limited in most
environments. In designing any new software system with dynamically changing structure, the same degree of care
should be given to a powerful set of tools for relating agents with domains as to the rest of the system design.

8.1.4.  Communication  Structure 

Communication functions can be classified as logical configuration, physical implementation, and end-to-end
communication.

• Each class of communication function should be provided independently.

Workers communicate, managers configure, and the HPC kernel implements. This separation is not provided by
current IPC mechanisms and operating systems. One specific consequence of this classification is that:

• Efficient separation of implementation from communication requires a third-party connect.

We cannot actively manage a distributed computation without making changes to it, and existing network protocols
do not allow an HPC kernel, for example, to set up or tear down connections between workers in an efficient way.
Third-party connect is an important session layer feature that should be incorporated in future protocol suites.

• Complex communication patterns can be expressed structurally.

In contrast to implementation (third-party connect), it is not necessary to support configuration in the
communication protocol. The paradigm of point-to-point connections between interfaces does not limit a system
to one-to-one communication patterns. It is not necessary to rely on details of addressing or routing to manipulate
heterogeneous parallel channels, homogeneous multiplexing, or multicasting.

• On-line computation of connectivity is practical, but non-trivial.

The most interesting algorithm in the HPC implementation is the incremental computation of communicating peers,
in which the HPC kernel converts configuration information into implementation decisions. Our best algorithm is a
centralized one that computes the effects of connect and disconnect in time proportional to the length of all the
affected paths. We unsuccessfully sought an algorithm with cost proportional to the number of affected paths
(effectively constant time). We would also prefer a decentralized algorithm that could be distributed.

The HPC path maintenance algorithm is similar to on-line algorithms for transitive closure, a formal property
with many practical applications. Therefore, we expect path maintenance to find use outside the context of HPC.
Path maintenance also has many potential applications in distributed network routing algorithms.
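
To make the comparison with on-line transitive closure concrete, here is a minimal sketch of a textbook incremental reachability update. It is not the HPC path maintenance algorithm: it handles only `connect`, omits `disconnect`, and the class and method names are hypothetical.

```python
from collections import defaultdict

class Reachability:
    """Maintains reach[u] = the set of nodes reachable from u over directed edges."""

    def __init__(self):
        self.reach = defaultdict(set)

    def connect(self, u, v):
        """Add edge u -> v; return the (source, target) pairs that became reachable."""
        targets = {v} | self.reach[v]
        sources = {u} | {x for x, r in self.reach.items() if u in r}
        added = set()
        for x in sources:
            new = targets - self.reach[x]
            added |= {(x, t) for t in new}
            self.reach[x] |= new
        return added

r = Reachability()
first = r.connect("a", "b")    # creates the pair a -> b
second = r.connect("b", "c")   # extends the existing path from a
third = r.connect("a", "c")    # redundant edge: nothing new
```

The set returned by `connect` plays the role of the newly communicating peers that a kernel would have to act on. The work done grows with the number of affected source and target nodes, which is why a cost independent of path length is hard to obtain.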
8.1.5. Distribution

• Distribution mandates an asynchronous system interface.

This point is controversial, but our experience is that synchronous interfaces are appropriate only for systems with
centralized behavior. To capture the essence of distribution, one must allow for the intrinsic asynchrony of
multiprocess programs, and of failures. Transactions (and atomic non-primitive actions) are not consistent with
highly-available access to a distributed data structure, because multiple agents may be inspecting and modifying a
shared data structure concurrently. Asynchronous notifications of change are needed to avoid expensive polling,
just  as  interrupts  are  needed  to  support  efficient  operating  systems.  Support  for  multithreaded  agents  also  requires 
an  asynchronous  interface,  to  allow  overlapping  operations  issued  by  several  threads  of  a  single  agent. 

Building synchronous interfaces for programmer convenience on top of the basic, asynchronous, system
interface is compatible with a distributed environment. For example, a programming environment might use several
lightweight processes to wait, synchronously, for each of several overlapping operations to complete. But such
programming environments are successful by virtue of what they hide. Eventually someone will need access to the
ugly asynchronous reality, even if only to build a better environment.
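
The layering described above can be sketched with threads standing in for lightweight processes. Everything here is an assumption made for illustration: `async_operation` is a stand-in for an asynchronous system call and is not part of any HPC interface.

```python
import threading

def async_operation(arg, on_done):
    """Stand-in asynchronous call: returns at once and notifies on_done later."""
    worker = threading.Thread(target=lambda: on_done(arg * 2))
    worker.start()
    return worker

def sync_operation(arg):
    """Synchronous convenience wrapper built on top of the asynchronous call."""
    done = threading.Event()
    box = []

    def on_done(result):
        box.append(result)
        done.set()

    async_operation(arg, on_done)
    done.wait()  # block this thread only
    return box[0]

def overlapped(args):
    """Run several operations at once, with one waiting thread per operation."""
    results = [None] * len(args)

    def wait_for(i, arg):
        results[i] = sync_operation(arg)

    threads = [threading.Thread(target=wait_for, args=(i, a))
               for i, a in enumerate(args)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The wrapper hides the notification callback from its caller, but the callback is still the only primitive underneath. A synchronous-only system interface could not be turned back into `async_operation` without extra threads.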

•  Partitions  do  not  require  reduced  availability. 

Put another way, if you know what you're doing, you can do it more often. Most databases understand nothing
about the semantics of the data they contain, and therefore cannot resolve merge inconsistencies in a sensible way.
As a result, database designers strictly avoid consistency problems by limiting availability. The more that is known
about  the  application,  the  more  restrictive  this  strategy  becomes. 

The  Locus  distributed  file  system  is  a  specialized  database  that  knows  the  semantics  of  much  of  its  data,  and 
exploits that knowledge to reconcile automatically many inconsistencies at merge time. HPC maintains a richer,
even  more  specialized,  database,  and  understands  almost  everything  about  its  data  (at  the  expense  of  containing  only 
specialized  data).  Merging  is  based  solely  on  current  partition  state,  using  a  history-less  algorithm,  and  most 
structural  features  can  be  reconciled  automatically. 

Our experience suggests that application complexity is not a basic obstacle to availability during partitioned
operation.  In  fact,  we  offer  this  heuristic  to  generalize  the  comments  just  made: 

•  The  greater  the  number  of  internal  constraints  a  specification  has,  the  fewer  the  external  constraints  an 
implementation will have to add to operate in a failure-prone, partitionable environment.

8.2. Suggestions for Future Research

Our  experiences  with  HPC  suggest  several  areas  for  investigation,  including  specific  improvements  to  HPC, 
general  network  services,  and  semantic  models  for  concurrent  programs. 

8.2.1. Design Extensions

It  is  usually  hazardous  to  allow  more  features  to  creep  into  a  satisfactory  system.  However,  there  are  at  least 
two  areas  of  the  HPC  design  where  additional  features  deserve  investigation. 

• Allow user-specified correspondence of views.

HPC's fixed definition of corresponding views leads to the tap problem, the user's inability to detect cycles, and
related problems. It may be possible to find one mechanism that provides solutions to this whole set of problems.
For example, users could assign endpoints tags or markers that would propagate along paths and define
corresponding  and  reachable  views.  Integrating  such  a  scheme  into  the  design  of  complex  communication  paths 
(e.g.,  multicasting),  and  the  implementation  of  path  maintenance  is  the  technical  challenge. 

•  Place  limits  on  splice  targets. 

As previously noted, hidden communication paths are an improvement over strict hierarchies, but they are not
always a good thing. It is easy to express limits on the targets of splices, for example: "must be within subtree T".
But it is not obvious how to check such constraints quickly, nor how to integrate them into the existing protection
system. In the current HPC design, no domain can limit what another domain does internally. One domain can
impose its will on an adjacent, inferior, domain only by taking full control over the inferior domain.

8.2.2.  Related  High-Level  Services 

HPC  provides  a  structuring  service.  Alone,  it  is  not  sufficient  to  build  complex,  distributed  applications. 
Many  of  the  other  necessary  services  (transport  protocols,  file  service,  name  service,  remote  execution)  already  exist 
in  most  environments,  but  we  found  the  need  for  some  network  services  not  yet  available. 

•  Dynamic  property  (arbitrary  string)  service 

HPC currently maintains two uninterpreted properties: role and type names. This was pragmatically the right thing
to do, but wrong in principle. Users should be able to attach arbitrary properties to structural items, as long as
proper operation of HPC does not depend on them, and the HPC kernel need not be extended to support them.
Instead, properties should be stored in a name service allowing dynamic registration of arbitrary data. The DARPA
domain name server demonstrates the technology needed to support name lookup for a restricted class of properties
changing fairly slowly. The proposed X.500 directory service has optional support for unrestricted properties, but
vigorous development is needed in this area, especially to allow users to quickly and automatically establish naming
sub-domains.

•  Network-wide  credentials  and  authentication 

HPC uses only TCP sequence numbers and IP source addresses to authenticate communication between components
of the system. This is very weak protection against a spoofing attack on the HPC implementation. Additionally,
HPC has no notion of user identity, and cannot provide its host sites with information needed for access control and
resource accounting purposes. This brings access to resources down to the lowest level: anonymous guest.

Traditional resource sharing on the Internet is accomplished by creating local user accounts within an
administrative boundary (e.g., MIT Multics), and using local authentication (login during telnet).* Using password
challenges when programs, rather than people, must be authenticated is a bad idea, because programs can be
examined for password strings. At a minimum, a network-wide credential and authentication scheme is needed
before any significant automated resource sharing can be done across administrative boundaries. The Kerberos
authentication system used in Project Athena is a good starting point for further work.

* The original Arpanet visions of distributed resource sharing have never really been fulfilled. With few exceptions, the networking
communities have stopped short at electronic mail, file transfer, and remote login.

•  Session  layer  support  for  configuration 

The third-party connect facility we found so important is a basic configuration feature that belongs to the session
layer. We expect dynamic reconfiguration of distributed applications to require a range of session layer features
beyond third-party connect, just as the current ISO proposals for session layer synchronization extend far beyond
the original concept of data quarantine. Current work into automatic network management should be broadened to
consider  the  necessary  protocol  support  for  automatic  application  management. 

•  Software  development  tools 

HPC  provides  a  raw,  low-level  environment,  as  expected  of  a  set  of  basic  mechanisms.  Tools  that  incorporate  some 
policies  and  allow  programming  at  a  higher  level  are  needed,  even  at  the  expense  of  generality.  The  most  glaring 
example  is  the  lack  of  a  standard  interactive  utility  for  HPC  analogous  to  the  many  shell  programs  for  the  Unix 
operating  system.  The  development  of  automated  agents,  perhaps  customized  for  particular  applications,  is  a  more 
challenging  research  area  that  brings  theoretical  studies  of  distributed  algorithms  together  with  systems  engineering 
and  implementations.  Another  interesting  problem  area  is  the  integration  of  HPC  mechanisms  with  conventional 
fault tolerance mechanisms (transactions, redundancy, recovery).

8.2.3. Semantics and Formal Directions

HPC’s  need  to  manage  a  strict  hierarchy  with  an  undirected  graph  model  suggests  some  extensions  to  formal 
semantic  models,  as  well  as  the  pragmatic  tools  discussed  above. 

• Remove the parent-child asymmetry in formal studies of semantics.

Many  formal  studies  of  the  semantics  of  concurrency  are  based  on  passive  process  hierarchies  as  in  CCS  and  CSP. 
The  axioms  of  composition  in  such  systems  describe  the  behavior  of  a  complex  node  as  a  function  of  the  behavior 
of  its  children.  However,  the  parent-child  relationship  is  partly  a  matter  of  perspective.  Any  node  can  be  chosen  as 
the  root  of  a  CCS  or  CSP  tree  without  affecting  its  behavior.  The  tree  has  exactly  the  same  leaves  composed  in 
precisely equivalent ways, no matter which node is selected as the root, and therefore must have the same behavior.
The sets of equations describing the various orderings of a tree often appear quite different. The laws of distribution
for a formal system must be sufficient to prove that all such sets are exactly equivalent.

•  Investigate  semantics  of  nonhierarchical  structures. 

CSP and CCS systems cannot express sharing or transparent abstraction. There is only one path of interaction
between two processes and all interactions between them are visible all along the path. There seem to be two
technical obstacles to providing passive graphs with a formal semantics. The first is infinite families of equations, or
of solutions to equations, due to cycles in the graph. The second is the loss of strictly local composition.

Our experience with HPC shows that families of formal solutions can be detected and reduced to a single
representative, or discarded if no concrete solution exists. We conjecture that this experience can be extended from
equations of connectivity to equations of behavior. The loss of local composition is more apparent than real. Every
direct interaction between two nodes is explicitly represented by an edge. Solving the equations for cycles
automatically handles any indirect interactions.

8.2.4. Distribution and Decentralization

A production quality HPC system would require significant improvements on the prototype implementation
we constructed. Most of the work must go into application managers, and the HPC kernel also needs some revision.
However, major improvements are beyond our current understanding of decentralized control, and we propose some
research needed to support the development of distributed agents.

•  Decentralize  the  HPC  kernel. 

The  HPC  interface  to  distributed  applications  and  managers  is  satisfactory  at  this  point,  but  the  HPC  implementation 
is too centralized. The implementation is built from several distributed processes, but the current HPC kernel can be
neither  replicated  nor  decentralized. 

A  preliminary  investigation  of  a  decentralized  kernel  indicated  that  many  kernel  functions  could  be  readily 
distributed.  The  obvious  way  to  divide  the  physical  database  is  along  logical  domain  boundaries,  replicating  copies 
of  a  domain  only  on  hosts  with  a  physical  agent  for  the  domain.  The  key  problem  area  is  decentralization  of  the 
algorithms,  rather  than  data.  In  particular,  it  is  not  obvious  how  to  distribute  the  path  maintenance  algorithm 
without  communicating  the  entire  path  maintenance  graph. 

•  Investigate  decentralized  control. 

This thesis explores a set of mechanisms, without presenting policies for their use. We have assumed policies are
determined  by  agents’  behavior,  outside  the  scope  of  our  study.  Behind  this  assumption  is  a  challenging  research 
area. 

Any  robust  application  must  have  multiple,  distributed  managers.  Those  managers  must  collectively  agree  on 
policy, and must further agree which particular manager is responsible for executing policy. Various special aspects
of decentralized control have been studied in different fields: distributed agreement in the areas of theoretical
distributed computing and reliable systems engineering, distributed control in the fields of industrial engineering and
applied mathematics, distributing an invariant in theoretical distributed computing, flow and congestion control in
protocol development and system modelling, and so on.

These  fields  of  intense,  specialized  research  can  all  contribute  to  the  study  of  the  stability,  efficiency,  and 
correctness  of  decentralized  manipulations  of  complex  discrete  structures.  As  a  specific  example,  we  propose  the 
decentralized tree editing problem for study. The well-known conventional tree (or string) editing problem is to
take two trees (strings) and a collection of editing primitives, and determine an optimal sequence of primitives to
transform one tree into the second. This abstract problem has practical applications in network management, failure
recovery, and software development.
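
For reference, the conventional, centralized string case can be sketched with the standard dynamic-programming edit distance. The unit costs, the three primitives, and the function name `edit_script` are assumptions for illustration; positions in the returned edits refer to the original source string.

```python
def edit_script(src, dst):
    """Return one minimal list of edits turning src into dst (unit costs)."""
    m, n = len(src), len(dst)
    # cost[i][j] = minimal number of edits turning src[:i] into dst[:j]
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        cost[i][0] = i
    for j in range(n + 1):
        cost[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = src[i - 1] == dst[j - 1]
            cost[i][j] = min(cost[i - 1][j] + 1,                       # delete
                             cost[i][j - 1] + 1,                       # insert
                             cost[i - 1][j - 1] + (0 if same else 1))  # keep/replace
    # Walk back from the far corner to recover one optimal sequence.
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and src[i - 1] == dst[j - 1] \
                and cost[i][j] == cost[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + 1:
            ops.append(("replace", i - 1, dst[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("delete", i - 1))
            i -= 1
        else:
            ops.append(("insert", i, dst[j - 1]))
            j -= 1
    return ops[::-1]
```

This solution relies on one agent seeing both inputs in full and in a fixed state, which is exactly what a decentralized setting cannot assume.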

The decentralized tree editing problem must be solved by multiple, communicating agents. Changes to both
trees may occur asynchronously, and different agents learn of changes at different times, perhaps in different
relative orders. They may not share a centralized database, and a locking facility on the trees is strongly
discouraged. Agents may not exchange the entire problem, only the minimum needed to coordinate their actions.
Ideally, a single agent attempts each necessary operation, and non-conflicting operations are done concurrently by
different agents. The solution should permit dynamic addition and removal of agents, as well as tree elements.
Further, the solution must be stable, resolving conflicts between agents quickly. These aspects of the decentralized
problem must be added to the existing correctness and optimality issues of the conventional problem.

• Explore application-specific impacts on control.

Some  applications  can  do  useful  work  while  partitioned  or  survive  the  replacement  of  components  without  special 
attention, while others must be explicitly resynchronized when reconfigured, and still others cannot tolerate any
visible  failures  or  changes  at  all. 

HPC’s  mechanisms  are  sufficient  to  control  the  first  group.  For  the  latter  groups,  HPC  must  be  supplemented 
by  mechanisms  for  manipulating  application  state,  for  example,  atomic  transactions.  An  agent  for  an  HPC  domain 
must use these other mechanisms correctly. It may be constrained in its use of certain HPC operations, because the
application under its control, or the supplementary mechanisms, cannot tolerate the results.

Atomic  transactions  accommodate  the  most  fragile  applications,  while  real-time  applications  usually 
accommodate the harshest environments. However, an application and its environment can make partial allowances
for  each  other,  and  the  resulting  systems  may  prove  more  efficient  than  either  extreme. 

8.3. Reprise

A  typical  distributed  application  has  a  hierarchical  structure  with  well-defined  communication  patterns 
between  loosely  coupled,  active  computation  elements.  The  distributed  environment  is  an  open  system  composed  of 
autonomous, heterogeneous, asynchronously running sites, subject to independent failure and network partition.

HPC  is  a  study  of  the  use  of  nested  process  abstractions  and  explicit  composition  to  represent  such 
applications.  Maintenance,  migration,  debugging,  and  adaptation  to  changing  environmental  conditions  are 
supported  by  HPC  operations  that  modify  process  structure  during  execution. 

The  HPC  design  emphasizes  the  structural  or  architectural  issues  in  distributed  software,  especially 
interactions  involving  dynamic  reconfiguration,  protection,  and  partition.  The  contributions  of  this  work  come  from 
the  detailed  consideration  of  how  the  seemingly  well-known  features  of  abstraction  and  composition  interact  with 
each  other  and  a  distributed  environment. 

This  thesis  is  also  a  rare  case  study  in  consistency  control  for  non-trivial,  highly-available  services. 
Operations  modifying  structure  are  fully  available  during  network  partitions.  The  inconsistencies  that  may  be 
encountered  during  merge  have  all  been  identified.  Each  problem  is  either  avoided,  automatically  reconciled  by  the 
system, or reported to users for application-specific recovery.


Appendix A
A. Formal Description

We  begin  this  Appendix  with  a  proof  of  the  characteristic  theorem  used  to  express  a  collection  of  dynamically 
changing  relations  as  subsets  of  a  fixed  relation.  This  theorem  is  critical  for  our  success  in  treating  most  dynamic 
structure  as  formally  immutable  and  thereby  eliminating  many  sources  of  inconsistency. 

The  remaining  dynamic  effects  are  modelled  by  treating  each  structure  as  a  formal  sentence  and  the 
operations  that  transform  structures  as  formal  axioms.  Section  A.2  gives  the  simplified,  abstract  form  of  structure 
used  throughout  this  Appendix. 

The predicates that define legal HPC structures provide this formal system with extensions in a simple model.
A  legal  structure  is  a  true  sentence.  The  constraints  that,  for  example,  force  strict  nesting  of  objects  are  all  encoded 
in  Section  A.3. 

Section  A.4  defines  some  core  operations  (simpler  than  the  primitives  provided  to  clients)  and  then  reduces 
the  client  primitives  to  core  operations.  Depending  upon  its  arguments,  each  client  primitive  may  translate  into  an 
arbitrary  number  of  core  operations  (e.g.,  destruction  of  a  complex  subtree). 

Every derivation of a sound formal system from a true sentence results in a true sentence. A formal system is
complete  in  a  strong  sense  if  it  can  derive  every  true  sentence.  Space  does  not  permit  full  proofs  of  HPC’s  formal 
soundness  and  completeness,  but  Section  A.5  outlines  such  proofs. 

A.1. Characteristic Theorem

HPC increases the number of formally immutable properties by distinguishing local knowledge, which may
change,  from  global  truths.  Many  dynamic  features  of  process  structure  can  be  limited  to  creation  and  destruction 
of  otherwise  static  elements.  We  pretend  these  elements  have  fixed  properties  throughout  an  infinite  lifetime,  and 
that  we  only  become  aware  of  them  when  they  are  created,  and  lose  awareness  of  them  when  they  are  destroyed. 
The  set  of  elements  known  at  any  time  and  place  is  a  characteristic. 

When  it  is  possible  to  use  them,  dynamic  local  characteristics  offer  a  major  advantage  over  dynamic  global 
relations.  If  the  structure  visible  in  a  partition  is  described  by  a  local  completely  known  relation,  and  this  local 
relation  is  formally  defined  as  the  intersection  of  a  global  (and  incompletely  known)  relation  with  local,  known 
characteristics,  then  partitions  can  be  merged  simply  by  taking  the  set  unions  of  the  local  relations  and  the  local 
characteristics. 

This property has important consequences. The global sets and relations are never needed, only the local
ones. This permits complete distributed management of structure. The merge procedure for local relations is
extremely  trivial,  yet  guaranteed  to  preserve  the  formal  definition  (and  consistency)  of  the  local  relations.  Finally, 
there  is  only  a  weak  constraint  governing  relations  and  characteristics.  A  relation’s  content  is  almost  irrelevant,  so 
many  different  structural  relationships  can  be  managed  this  way. 

Given

    $R \subseteq X \times Y$

Let R be a binary relation over sets X and Y. These describe the global, immutable (and never completely known) "truth" about structure.

    $c\text{-}X_i \subseteq X, \quad c\text{-}Y_i \subseteq Y$

In each partition i, there are local, dynamic characteristic sets, describing the structural elements known at this time and place.

    $c\text{-}R_i = R \cap (c\text{-}X_i \times c\text{-}Y_i)$

A local, dynamic structural relation is formally defined as the intersection of a global relation with the corresponding local characteristics.

    $c\text{-}X_i\,x \wedge R\,x\,y \Rightarrow c\text{-}Y_i\,y$

There is one constraint governing local characteristics and the global relations. The image of a characteristic by the relation must also be characteristic.* This is a formal way of saying that we must understand the answers to any questions we can pose.

Theorem

    $c\text{-}R_i \cup c\text{-}R_j \equiv R \cap ((c\text{-}X_i \cup c\text{-}X_j) \times (c\text{-}Y_i \cup c\text{-}Y_j))$

This is the formal statement of the theorem for binary relations and two partitions. Merging two local relations is identical to taking a fresh intersection of the global relation with the merged local characteristics.
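The theorem can be checked mechanically on a small example. The following Python sketch is illustrative only: the sets, the relation `R`, and the helper names `local_relation` and `image_closed` are hypothetical and not part of HPC. The characteristics are chosen to satisfy the image constraint stated above.

```python
from itertools import product

def local_relation(R, cX, cY):
    """Local relation: the global relation restricted to the characteristics."""
    return R & set(product(cX, cY))

def image_closed(R, cX, cY):
    """Constraint: if x is characteristic and R x y holds, y is characteristic."""
    return all(y in cY for (x, y) in R if x in cX)

# Global relation (in practice never completely known).
R = {("a", 1), ("a", 2), ("b", 2), ("c", 3)}

# Characteristic sets of two partitions i and j.
cX_i, cY_i = {"a"}, {1, 2}
cX_j, cY_j = {"b", "c"}, {2, 3}

cR_i = local_relation(R, cX_i, cY_i)
cR_j = local_relation(R, cX_j, cY_j)

# Merge by set union versus a fresh intersection with merged characteristics.
merged = cR_i | cR_j
fresh = local_relation(R, cX_i | cX_j, cY_i | cY_j)
```

Under these assumptions `merged` and `fresh` coincide. If the image constraint is dropped (for example, characteristics `{"a"}` and `{1}` in one partition), the union can miss tuples that the fresh intersection contains.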


Proof 

To  simplify  the  proof  we  introduce  some  abbreviations,  and  freely  mix  relational,  set  and  predicate  notations. 
No  confusion  should  result.  The  extensions  to  n  -place  relations,  and  arbitrary  numbers  of  partitions  are  completely 
straightforward. 

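    $A \triangleq R\,x\,y \quad B \triangleq c\text{-}X_i\,x \quad C \triangleq c\text{-}Y_i\,y \quad D \triangleq c\text{-}X_j\,x \quad E \triangleq c\text{-}Y_j\,y$

First direction: $(c\text{-}R_i \cup c\text{-}R_j)\,x\,y \Rightarrow (R \cap ((c\text{-}X_i \cup c\text{-}X_j) \times (c\text{-}Y_i \cup c\text{-}Y_j)))\,x\,y$

 1)  $(c\text{-}R_i \cup c\text{-}R_j)\,x\,y$                                                      assume
 2)  $[A \wedge B \wedge C] \vee [A \wedge D \wedge E]$                                              1, definition
 3)  $A \wedge (B \vee D) \wedge (C \vee E) \wedge [(B \vee E) \wedge (C \vee D)]$                   2, deMorgan
 4)  $A \wedge (B \vee D) \wedge (C \vee E)$                                                         3, $\wedge$-E
 5)  $A \wedge (c\text{-}X_i \cup c\text{-}X_j)\,x \wedge (c\text{-}Y_i \cup c\text{-}Y_j)\,y$       4, definition
 6)  $(R \cap ((c\text{-}X_i \cup c\text{-}X_j) \times (c\text{-}Y_i \cup c\text{-}Y_j)))\,x\,y$     5, definition

Other direction: $(R \cap ((c\text{-}X_i \cup c\text{-}X_j) \times (c\text{-}Y_i \cup c\text{-}Y_j)))\,x\,y \Rightarrow (c\text{-}R_i \cup c\text{-}R_j)\,x\,y$

 7)  $A \wedge (B \vee D) \wedge (C \vee E)$                                                         assume, def
 8)  $(A \wedge B) \vee (A \wedge D)$                                                                7, $\wedge$-E, deM
 9)  $A \wedge B \Rightarrow C$                                                                      given
10)  $C \vee D$                                                                                      8, 9, MP, deM, $\wedge$-E
11)  $A \wedge D \Rightarrow E$                                                                      given
12)  $B \vee E$                                                                                      8, 11, MP, deM, $\wedge$-E
13)  $A \wedge (B \vee D) \wedge (C \vee E) \wedge [(B \vee E) \wedge (C \vee D)]$                   7, 10, 12, $\wedge$-I
14)  $(c\text{-}R_i \cup c\text{-}R_j)\,x\,y$                                                      deM, def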

* This constraint is asymmetric. Only one domain of the relation has to take the role of X. For n-place relations with n > 2, this domain can constrain the others either directly, or transitively ($c\text{-}X_i\,x \wedge R\,x\,y \Rightarrow c\text{-}Y_i\,y$; $c\text{-}Y_i\,y \wedge R\,x\,y \Rightarrow c\text{-}Z_i\,z$).

A.2. Formal Structure

HPC formal structure can be divided into core and derivative relations, and into immutable and dynamic
relations. The core relations are the ones directly manipulated by HPC primitive operations, while the derivative
relations  describe  the  more  complex  consequences  of  simple  operations.  In  this  Section  we  use  PROLOG  to  define 
the  derivative  relations  in  terms  of  the  core. 

The immutable relations describe fixed properties for which inconsistencies can never arise. Normally, only
implementation-specific constants would be core and immutable, but this subjects too much structure to merge
inconsistencies (Chapter 6). When the characteristic theorem is exploited, only three core dynamic relations are
formally manipulated by HPC operations or require non-trivial reconciliation after network merges.

A.2.1.  Core  Immutable  Relations 


Primitive  Structural  Elements 

view(V) 

domain(D)

Shells, splices, interfaces, connections, controllers, and protection boundaries are all formally reduced to relations
on views and domains. These relations are only known partially, through the use of local characteristic sets of
views and domains. There is a reserved domain root.

system(D)

Processes, controllers, and the root domain form a distinguished subset of all domains.


Primitive  IPC  Properties 

IPC properties are known globally and immutably.

medium(M)
/* M is a supported IPC mechanism */

orientation(O)
/* O is a supported IPC direction */

The supported mechanisms and directions are implementation dependent, but must include the reserved mechanism
control.

reverse(O1, O2)
/* O1 and O2 are complementary orientations */

This relation must be symmetric and pseudo-transitive (odd length sequences must be closed under the relation).
This ensures that end-to-end chains that are locally complementary on every link have complementary endpoints.
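The two requirements on reverse can be stated as executable checks. The sketch below is an assumption-laden illustration: it reads "odd length sequences must be closed" as "any three consecutive complementary links compose to a complementary pair", and the orientation names `in`, `out`, and `both` are hypothetical.

```python
def is_symmetric(rel):
    """reverse(a, b) implies reverse(b, a)."""
    return all((b, a) in rel for (a, b) in rel)

def is_pseudo_transitive(rel):
    """Three complementary links a-b, b-c, c-d imply that a-d is complementary.

    By induction this closes every odd-length sequence under the relation.
    """
    return all(
        (a, d) in rel
        for (a, b) in rel
        for (b2, c) in rel if b2 == b
        for (c2, d) in rel if c2 == c
    )

# Hypothetical orientation set: a directional pair and a self-complementary one.
REVERSE = {("in", "out"), ("out", "in"), ("both", "both")}
```

With such a relation, a chain whose every link is locally complementary has an odd number of links between its endpoints, so the endpoints are complementary as well.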

structure([T, P, [M, O]])
structure([T, P, [S, S, ...]])
/* T is "simple", "bundle", "multiplex", or "multicast"
 * P is "extension", "endpoint", or "masked"
 * medium(M), orientation(O), structure(S)
 */

A simple view structure specifies an IPC mechanism and orientation, while a complex structure specifies a sequence
of structures. We omit the detailed constraints on the global and immutable structure relation (e.g., child structures
of a masked complex structure must also be masked) except where the constraints are relevant to later discussion.

Primitive Policy Elements

loss(L)
/* L is "temporary" or "permanent" */

policy(P)
/* P is "abdicate", "suspend" or "die" */

The policy sequences discussed in the text complicate formal proofs without adding much content, so we will limit
formal discussion to single basic policies, rather than sequences. Policy elements are known globally and
immutably.

View  Hierarchies 

View hierarchies are only known locally, based on the view characteristic.

component(C, P)
/* C is a component view of P */

Every view has exactly one parent throughout its lifetime, except the roots of view hierarchies that never have a
parent. Roots of view hierarchies are the bundle endpoints comprising shells, and their immediate children are the
interfaces presented to clients.

Protection  and  Privilege 

member(V, D)
/* view V is a member of domain D */

A view belongs to exactly one domain throughout its lifetime. It is replaced with a different view (renamed)
whenever it would intuitively move between domains. Domain membership is only known locally, based on both
characteristic sets.

controller(V)
/* V is top-level inside controller space */

This is the ancestor of the views where agent invocations are received, and HPC system responses are sent.

Primitive Private Peers

bound(V1, V2)
/* V1 and V2 comprise a shell or a splice */

The root of a view hierarchy has exactly one private peer throughout its lifetime. These pairs are the formal shells
and splices. A root view is replaced with a different view (renamed) whenever its peer would be changed (e.g., by
transfer between domains or by endpoint/extension promotion).

points_up(V)
/* V is the inner member of a shell */

The hierarchy is defined by directed shells. The views that lead toward the root are distinguished.

(This is an immutable property because eversion replaces a directed shell with a splice. Eversion could safely
preserve view identifiers by making points_up a dynamic property provided changes to the relation are monotonic.
Eversion satisfies this condition: false is never changed to true.)

Additional IPC properties

Some IPC properties are known only locally, through the view characteristic.

vstruct(V, S)
/* view V has structure S */

viable(V)
/* view V is a complex endpoint not in a process space,
 * or a simple endpoint in a process space
 */

Viable views are those that can be used for communication. This is the only (indirect) reflection of real processes in
this formal model.

index(V, I)
/* fixed number to determine corresponding components */

A small integer based on view structure, or a unique number based on invocations of new.


A.2.2.  Derivative  Immutable  Relations 

simple(V) :- vstruct(V, [simple, _, _]).
bundle(V) :- vstruct(V, [bundle, _, _]).
multicast(V) :- vstruct(V, [multicast, _, _]).
multiplex(V) :- vstruct(V, [multiplex, _, _]).
extension(V) :- vstruct(V, [_, extension, _]).
endpoint(V) :- vstruct(V, [_, endpoint, _]).
masked(V) :- vstruct(V, [_, masked, _]).

These predicates simply provide easier access to the vstruct relation.

corresponding(V1, V2) :- index(V1, I), index(V2, I).

Correspondence  follows  directly  from  the  fixed  index. 

static_component(C, P) :- component(C, P), bundle(P), endpoint(P).
static_component(C, P) :- component(C, X), static_component(X, P).

Creating a bundle endpoint requires automatic creation of its immediate children. These are static, as opposed to
dynamic, component views.

complementary([simple, _, [M, O1]], [simple, _, [M, O2]]) :-
    reverse(O1, O2).
complementary([C, _, S1], [C, _, S2]) :-
    complementary(S1, S2).
complementary([S1h | S1t], [S2h | S2t]) :-
    complementary(S1h, S2h), complementary(S1t, S2t).

Peer views must have complementary structure. For simple views the medium must be the same, and the
orientations must be complementary. For complex views, the component structure(s) must be complementary.
Multicast and multiplex views have a single structure, while bundle views have a list of structures each of which
must be complementary.
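The complementarity rule can be sketched outside Prolog as a short recursive check. This is a hedged illustration, not the HPC implementation: structures are assumed to be three-element lists `[type, role, body]` following the structure declaration above, the role is ignored, and the orientation names in `REVERSE` are hypothetical.

```python
# Hypothetical orientation pairs standing in for the reverse relation.
REVERSE = {("in", "out"), ("out", "in")}

def complementary(s1, s2):
    """True if two view structures could belong to peer views."""
    type1, _, body1 = s1
    type2, _, body2 = s2
    if type1 != type2:
        return False
    if type1 == "simple":
        # Same medium, complementary orientations.
        (medium1, orient1), (medium2, orient2) = body1, body2
        return medium1 == medium2 and (orient1, orient2) in REVERSE
    # Complex view: component structures must pair up and be complementary.
    return len(body1) == len(body2) and all(
        complementary(a, b) for a, b in zip(body1, body2)
    )

tcp_in = ["simple", "endpoint", ["tcp", "in"]]
tcp_out = ["simple", "extension", ["tcp", "out"]]
left = ["bundle", "endpoint", [tcp_in, tcp_out]]
right = ["bundle", "endpoint", [tcp_out, tcp_in]]
```

Here `complementary(left, right)` holds because each corresponding pair of children uses the same medium with reversed orientation.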

splice(V1, V2) :- bound(V1, V2), not(points_up(V1) ; points_up(V2)).

shell(V1, V2) :- bound(V1, V2), (points_up(V1) ; points_up(V2)).

One side of a shell leads toward the root. This is the only formal distinction between shells and splices.

descendant(D, A) :- component(D, A).

descendant(D, A) :- component(D, B), descendant(B, A).

toplevel(V) :- not(component(V, _)).

These predicates just provide easier access to the component relation.

A.2.3.  Core  Dynamic  Relations 
Spaces 

adjacent(V1, V2)
/* V1 and V2 are currently view hierarchy roots */

It is technically more convenient to maintain the adjacency of view hierarchy roots, rather than all views. Spaces
are defined as the equivalence classes of a relation over all views, samespace, derived from this core relation. All the
views in a class are incident on the same space. Two spaces are merged by making their view hierarchies all
adjacent to one another.
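Deriving spaces from root adjacency amounts to computing equivalence classes. The sketch below uses a small union-find for that purpose; it is an illustrative assumption about how one might compute the classes, and the function name `spaces` and the view names are hypothetical.

```python
def spaces(roots, adjacent):
    """Group view hierarchy roots into spaces (equivalence classes of adjacency)."""
    parent = {v: v for v in roots}

    def find(v):
        # Follow parent links to the class representative, halving the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in adjacent:
        parent[find(a)] = find(b)

    classes = {}
    for v in roots:
        classes.setdefault(find(v), set()).add(v)
    return {frozenset(members) for members in classes.values()}

roots = {"v1", "v2", "v3", "v4"}
before = spaces(roots, {("v1", "v2"), ("v3", "v4")})
# Merging two spaces: make a root of one adjacent to a root of the other.
after = spaces(roots, {("v1", "v2"), ("v3", "v4"), ("v2", "v3")})
```

Because the classes are closed under symmetry and transitivity by construction, a single new adjacency between two roots is enough to merge their spaces in this sketch.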

Merge inconsistencies can violate the equivalence constraint. These violations are reconciled automatically.

Public Peers

connected(V1, V2)
/* V1 and V2 are current public peers */

This is always a symmetric and anti-reflexive relation. Ideally, it relates distinct pairs of views, but merge
inconsistencies can violate that constraint. Clients must resolve these violations.

Terminal  Policies 

loss_policy(D, L, P)
/* P is the policy for loss of control L over domain D */

In the complete HPC system, P is a non-empty list of basic policies or animations with parameters. The temporary
list is terminated with suspend. The permanent list may not contain suspend and is terminated with abdicate. In
this formal appendix, P must be a single basic policy.

Merge inconsistencies are resolved automatically. This formal model does not capture the sequential
application of terminal policies.

A.2.4. Derivative Dynamic Relations

Spaces 

V2j  AdjAaenCi';.  V?) . 

V?)  oaifxnAnt  rvi.  pj.  CP.  V2J. 

V2)  arpenantf*^.  P).  (VI.  P). 

ejAATfv;.  V?)  ■heUrvi.  V?).  wm’tmsCJ';  Dl.  •wbrrO'S.  K. 

V2J  ba^Cr..  V2).  wtertVl.  Dl),  A««tmr(V3.  D?).  K  !•  t2. 

0|S«|j»(Vl,  V?)  ;•  V?i . 

f«rtw:{V.  haundT.*,  VI),  not|tvt»r(Vl.  01). 
tcunOA.  O'.  Dj  1;).  ). 


I 
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below  r.'.,  VI)  ac;^aer.' r.',,  V’L),  pcirisjjp(VL) , 
clear  f.'_.  VJ) ,  ad laoen*.  I V2 ,  VJ) , 
be.ow;\',  belc-T.*!,  V, ,  below (V,  W) . 

s^nc.'C.',  ::  rr.  (celou  r.',  _)),  Dcjrxiar>’(V.  D) . 

;r.;er.c: 'V,  T  r.r.  (oelcw V)),  ba,rxiar.- Dj . 


View  Suspension 

suspended(V) :- connected(V, V1), connected(V, V2), V1 != V2.
suspended(V) :- superior(V, D), superior(V1, D), V1 != V.
suspended(V) :- boundary(V, D), loss_policy(D, temporary, suspend),
    temp_control_loss(D).
suspended(V) :- component(V, P), suspended(P).

suspended describes forced suspensions due to violations of constraints discussed in the next Section, or to temporary
loss of control. Four conditions force the suspension of a view. It may have multiple connections, be the root view
of a surplus superior domain boundary, be a domain boundary for a suspended domain, or be a child of a suspended
view. Forcibly suspending one view may indirectly affect the liveness of other views through other relations.

temp_control_loss(D) :- control_view(C, D), liveness(C, suspended).

temp_control_loss(D) :- control_view(C, D), liveness(C, dead),
    real_boundary(B, D), chain([private, C], [public, B], _).

oorttxt)l^viafw(y,  C) 

ttmtotriZ,  D).  oortrcller  (C) ,  oorponercC?,  C),  orpansit  0*:,  P). 

A domain is suspended when a temporary loss of control is detected and the current temporary terminal policy is
suspend. A controller shell is a pair of bundles with one multicast endpoint component. Inside the shell, the
HPC system creates one control endpoint component of the multicast view (two levels down from the shell). If this
internal control view is suspended, or dead with at least one chain reaching the domain boundary, control has been
lost at least temporarily.




Private  Peers  and  Chains 

private(V1, V2) :- bound(V1, V2).
/* V1 and V2 are currently bound private peers */

private(V1, V2) :-
    component(V1, P1), component(V2, P2), corresponding(V1, V2),
    private(P1, P2).
/* V1 and V2 are corresponding components of private peers */

chain([private, V1], [private, V2], []) :- private(V1, V2).
chain([public, V1], [public, V2], []) :- connected(V1, V2).
/* V1 and V2 are bound in one step, privately or publicly */

chain([private, V1], T, [[public, V2] | R]) :-
    private(V1, V2), chain([public, V2], T, R).
/* A private one step binding from V1 to V2
 * extends chains starting with a public binding of V2
 */

chain([public, V1], T, [[private, V2] | R]) :-
    connected(V1, V2), chain([private, V2], T, R).
/* A public one step binding from V1 to V2
 * extends chains starting with a private binding of V2
 */

Private peers and chains are defined recursively in terms of corresponding components and direct private and public
bindings. The definition of private peers given here is the looser one used in the path maintenance algorithm.

(A serious PROLOG implementation would require a check for cycles in the last two clauses of chain.)

Liveness

liveness(V, suspended) :- suspended(V), !.

liveness(V, alive) :-
    private(V, V2), viable(V2), not(suspended(V2)), !.
/* V is privately bound to V2 */

liveness(V, alive) :-
    chain([private, V], [public, V3], [[public, V2]]),
    liveness(V3, alive), !.
/* V is privately bound to V2, which is publicly bound to V3 */

liveness(V, suspended) :-
    private(V, V2), viable(V2), suspended(V2), !.

liveness(V, suspended) :-
    chain([private, V], [public, V3], [[public, V2]]),
    liveness(V2, suspended), !.

liveness(V, dead).

A view can be forced into suspended liveness. Otherwise, we look down chains starting with a private binding, first
for alive views, then for suspended ones. If no one step private bindings lead to viable endpoints, then liveness is
inherited from the next step down the chain. If no viable peers are found, either alive or suspended, the view is
dead. (Cycles must be avoided in the third and fifth clauses given here.)

A.3. Legal Structures

The core relations provide a structural framework without any definition of legal structure. The permissible
HPC structures form a very small part of the possible ones. Treated as a formal model, the core relations give an
alphabet of symbols, with no axioms constraining their interpretations. We formally define the legal HPC structures
as those interpretations satisfying the axioms given in this section. These axioms are applied to the structure known
in the current partition, that is, the local characteristic relations. The renaming technique violates many axioms
when the formally complete (over all time and partitions) global structure is considered.

It  will  be  our  goal  in  the  next  few  sections  to  show  that  legal  structures  are  closed  under  the  HPC  primitive 
operations  (soundness),  and  that  any  legal  structure  (up  to  isomorphism)  can  be  created  by  the  primitives 
(completeness).  However,  legal  structures  are  not  closed  under  merge  inconsistencies.  After  the  HPC  system 
applies its automatic reconciliation of merge inconsistencies, two axioms concerning mutable relations can remain
violated. In these cases, legal structure must be restored by application managers. We will make no attempt to
capture formally the broader sense of consistent behavior enforced over a merge inconsistency by suspended
liveness.

A.3.1. Notational Conventions

As in the proof of the characteristic theorem, it is convenient to treat relations as predicates, as sets, and as
functions. To use a relation as a predicate, we supply a single value for each domain of the relation in parentheses.


component(v, p)    // true iff v is a component of p

When treating relations as sets of tuples, we use conventional brace ({s1, s2}) and angle bracket (<t1, t2>)
notation. Modification of structure is expressed using C-like notation for set addition and set subtraction. E.g.,

connected += { <v1, v2>, <v2, v1> }
viable -= { v1 }

To  treat  relations  as  (partial)  functions,  we  want  to  provide  a  value  for  some  domain(s)  of  the  relation,  and  get 
back  the  values  of  the  other  domains.  For  example,  we  would  like  to  know  what  views  are  children  of  the  given 
view  V.  By  providing  a  set  of  input  values,  we  obtain  the  image  of  the  set  under  the  relation.  We  introduce  a  quoted 
number  convention  to  indicate  which  domain(s)  of  the  relation  is  (are)  to  cast  the  image. 

component'2'{v}    // set of children of v

component'1'{v}    // set of parents of v

loss_policy'1,2'{<d, l>}    // policies that apply to domain d under loss l

We systematically confuse a singleton set with its member.
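The quoted-number image convention can be mimicked with a small helper over sets of tuples. This is only a sketch of the notation: the function name `image`, the sample tuples, and the decision to return bare values for single remaining positions (mirroring the singleton convention) are all assumptions.

```python
def image(rel, inputs, *positions):
    """Image of `inputs` under `rel`, with `positions` (1-based) as the input domains.

    Returns the values of the remaining domains for every matching tuple.
    """
    idx = [p - 1 for p in positions]
    wanted = {x if isinstance(x, tuple) else (x,) for x in inputs}
    result = set()
    for tup in rel:
        if tuple(tup[i] for i in idx) in wanted:
            rest = tuple(tup[i] for i in range(len(tup)) if i not in idx)
            result.add(rest[0] if len(rest) == 1 else rest)
    return result

# component holds <child, parent> pairs; loss_policy holds <domain, loss, policy>.
component = {("c1", "p"), ("c2", "p"), ("p", "r")}
loss_policy = {("d", "temporary", "suspend"), ("d", "permanent", "abdicate")}

children_of_p = image(component, {"p"}, 2)                   # component'2'{p}
parents_of_c1 = image(component, {"c1"}, 1)                  # component'1'{c1}
temp_policy = image(loss_policy, {("d", "temporary")}, 1, 2)  # loss_policy'1,2'{<d, temporary>}
```

The set addition and subtraction notation maps directly onto set union and difference in the same representation.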

Vertical  bars  denote  the  cardinality  of  a  set. 

| { } | == 0
| { <v1, v2> } | == 1

When  used  as  predicates,  an  empty  set  denotes  falsehood. 

Iteration  over  the  members  of  a  set  uses  this  notation. 

for s in {s1, s2, s3}
    // body of iteration

Bundle  structures  specify  a  sequence  of  child  structures.  We  will  need  to  extract  this  sequence  from  the 
bundle and refer to its individual elements. Given a structure s, we use this notation:

s.components       // sequence part of structure

|s.components|     // number of elements

s.components[i]    // i-th element of sequence


A.3.2. New Element

Structure  modification  formally  involves  structure  that  always  existed  and  simply  wasn’t  known  in  the  current 
partition. By changing characteristic sets we change the known structure. Actually, we rename and generate new
structure on the fly. We use the assertion new(e) to indicate a view or domain that has never been known, and avoid
axioms  that  describe  structure  before  its  generation.  The  rigor  required  to  formalize  this  meta-axiom  is 
unrewarding. 

A.3.3. Immutable Relations

domain(d)    // characteristic

view(v)      // characteristic

The local characteristic relations all rely on these characteristic sets, which have no internal constraints.

controller(v)

=> 

vdewCv) 

//  characteristic 

controller (v) 

=> 

•.table  (V) 

if  liveness 

oontrc'iler  (v) 

=> 

poi.nts_up(v) 

//  root  of  spaoe 

cuitrolier  tv; 

=> 

adnaoer.t' 1' V  — 

//  leaf  spaoe 

cor.rrcller ‘'v) 

=> 

cemeoted'  1 '  (desoi’ica.nt '  2'  v) 

>“  {  )  il  er-y 

ror.trol.er  (v) 

=> 

avster,  (rncmcer '  1 '  Vi 

//  protected 

cor.trv'l. '  r  i\’i 

-> 

\3trcr- {V,  !oc>dle,  e-x^irt, 

,  //  structure 

Iruit  least,  enc^int, 

Isinple,  erdpoisii,  (oontrcl, 

A  controller  view  must  be  the  inside  boundary  of  an  empty  shell  with  a  specific  structure.  They  are  protected  from 
application  domains. 


oa^penent  (v.  p) 
coTponer.t  (v,  p) 
crs’poner.t  (v,  p) 
axipcner.t  |v,  p) 
SQTVoner.t  (v,  p) 
carporvent  iv,  p) 
bundle  (p) 
rulticast  (p‘ 
rTultiplex(p) 
oorpenent  (v,  p) 
cxjrponer.t  (v,  p) 


«>  view{v) 

>f>  ■t'iewip} 

«>  ocrpiex<pJ 
«>  enc^int  (p) 

•=>  viable  (p) 

*>  vstruct'  I'v  — 
*>  vstrjct'l'v  — 
«>  vsr.rjct' I'v  » 
*>  dcralt'l'v  — 
«>  corpcner.t' I'v 


//  characteristic 
//  characteristic 
//  parent  structure 
//  pare'--  structure 
//  parent  liveness 
//  cerron  structure 
>  vstruct' I'p.ocrponents (index' 2 'v; 
=  vstruct' I'p.ocrponentsd' 

‘  vstruct'  I'p.ccrtporants  [1] 
ooreLn'l'p  //  corron  doradr. 

—  (p)  //  unique  rrap 


The  parent  of  a  component  view  must  be  a  viable  complex  endpoint.  The  structure  of  the  component  must  be  one  of 
the  structures  specified  by  the  parent. 


member(v, d)  =>  view(v)       // characteristic
member(v, d)  =>  domain(d)     // characteristic
| member'1'v | == 1             // unique, complete map

Every view belongs to exactly one domain.


bound(v1, v2)  =>  view(v1)                                       // characteristic
bound(v1, v2)  =>  view(v2)                                       // characteristic
bound(v1, v2)  =>  complementary(vstruct'1'v1, vstruct'1'v2)
bound(v1, v2)  =>  bundle(v1)                                     // view structure
bound(v1, v2)  =>  endpoint(v1)                                   // view structure
bound(v1, v2)  =>  bound(v2, v1)                                  // symmetric
bound(v1, v2)  =>  v1 != v2                                       // anti-reflexive
bound(v1, v2)  =>  bound'1'v1 == {v2}                             // unique map
bound(v1, v2)  =>  toplevel(v1)                                   // only vh roots

Shells and splices must be distinct toplevel bundles with complementary structures. HPC also requires a unique
private binding, although this constraint could be relaxed.

points_up(v)  =>  view(v)        // characteristic

points_up(v)  =>  toplevel(v)    // only vh roots

Only toplevel views are part of the hierarchical relation.


vstruct(v, s)  =>  view(v)         // characteristic
vstruct(v, s)  =>  structure(s)    // characteristic
vstruct(v, s)  =>  !masked(v)      // no masked views
| vstruct'1'v | == 1               // unique, complete map

Every view has exactly one structure.


viable(v)  =>  view(v)        // characteristic

viable(v)  =>  toplevel(v)    // only roots

Only toplevel views are part of the viability relation.

index(i, v)


//  characterise  i'- 


inQ2>:(:,  v; 

>  view(v) 

//  dvaracterist: 

1  inQex'2'v  |  — 

- 

//  miqje,  cerr  ce  rrap 

Every  view  has  exactly  one  index. 

A.3.4. Mutable Relations

adjacent(v1, v2)  =>  view(v1)                                // characteristic
adjacent(v1, v2)  =>  view(v2)                                // characteristic
adjacent(v1, v2)  =>  adjacent(v2, v1)                        // symmetric
adjacent(v1, v2)  <=  adjacent(v1, v3) && adjacent(v3, v2)    // transitive
adjacent(v1, v2)  =>  domain'1'v1 == domain'1'v2              // common domain
adjacent(v, v)    <=  toplevel(v)                             // all vh roots
adjacent(v1, v2)  =>  toplevel(v1)                            // only vh roots

Only  toplevel  views  are  part  of  the  adjacency  relation,  an  equivalence  relation.  Each  equivalence  class  defines  a 
space  and  must  belong  to  a  common  domain. 


connected(v1, v2)  =>  view(v1)                                       // characteristic
connected(v1, v2)  =>  view(v2)                                       // characteristic
connected(v1, v2)  =>  samespace(v1, v2)                              // same space
connected(v1, v2)  =>  extension(v1)                                  // structure
connected(v1, v2)  =>  extension(v2)                                  // structure
connected(v1, v2)  =>  complementary(vstruct'1'v1, vstruct'1'v2)
connected(v1, v2)  =>  connected(v2, v1)                              // symmetric
connected(v1, v2)  =>  v1 != v2                                       // anti-reflexive
connected(v1, v2)  =>  connected'1'v1 == {v2}                         // unique map


Connected  views  must  be  distinct  complementary  extensions  in  the  same  space.  HPC  allows  at  most  one  connection 
per  view,  but  merge  inconsistencies  can  lead  to  multiple  connections. 
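A checker for part of these axioms shows how a merge inconsistency becomes visible. The sketch below covers only symmetry, anti-reflexivity, and the unique map; the same-space, extension, and complementarity axioms are omitted, and the function name and violation labels are hypothetical.

```python
def connected_violations(connected):
    """List violations of the symmetric, anti-reflexive, and unique-map axioms."""
    problems = []
    for (a, b) in sorted(connected):
        if (b, a) not in connected:
            problems.append(("asymmetric", a, b))
        if a == b:
            problems.append(("reflexive", a, b))
    for a in sorted({a for (a, _) in connected}):
        peers = {b for (x, b) in connected if x == a}
        if len(peers) > 1:
            # More than one connection per view: the case a merge can produce.
            problems.append(("multiple", a, tuple(sorted(peers))))
    return problems

legal = {("v1", "v2"), ("v2", "v1")}
# Two partitions each connected v1 to a different peer before merging.
merged = legal | {("v1", "v3"), ("v3", "v1")}
```

In this sketch the merged relation stays symmetric and anti-reflexive, and only the unique-map axiom fails for `v1`, which corresponds to the multiple-connection condition that forces suspension.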


loss_policy(d, l, p)  =>  domain(d)              // characteristic

loss_policy(d, l, p)  =>  loss(l)                // characteristic

loss_policy(d, l, p)  =>  policy(p)              // characteristic

loss_policy(d, permanent, p)  =>  p != suspend   // must recover control

domain(d) && !system(d)  =>                      // unique, complete map

    | loss_policy'1,2'<d, l> | == 1

Each non-system domain must have exactly one policy for each type of loss of control. HPC refuses permanent
responsibility for a domain. In this presentation, only basic policies are allowed.


A.3.5. Hierarchical Constraints

The axioms which enforce the appearance of a hierarchy are definitely the most complex as well as the
hardest to handle when showing soundness.

d == root  =>  | superior'2'd | == 0    // root

d != root  =>  | superior'2'd | == 1    // unique superior

The overall root space has no upper pointing views, while all other legal spaces have exactly one view hierarchy
that points toward the root. However, merge inconsistencies can produce multiple superior interfaces at the overall
root or at the root of a non-system domain.


below(v1, v2)  =>  !below(v2, v1)        // anti-symmetric

below(v1, v2)  =>  !samespace(v1, v2)    // anti-reflexive

below(v1, v2) && below(v1, v3)           // hierarchy

    =>  samespace(v2, v3) || below(v3, v2) || below(v2, v3)

member'1'v1 == member'1'v2               // contiguous

    =>  samespace(v1, v2) || below(v1, v2) || below(v2, v1)

All spaces of a domain must be contiguous, and they must be organized into a tree.

superior(v, root)
    =>  vstruct'1'(adjacent'1'v) == [bundle, endpoint, []]

system(d) && d != root  =>  | inferior'2'd | == 0

!system(d)  =>
    | { c : c in opaque'1'(inferior'2'd) && controller(c) } | == 1

Inferior domains of the root domain are opaque top level applications with no interfaces. Process and controller
domains have no inferior domains. Non-system domains must have a unique, immediately adjacent, controller
domain.

A.3.6. Constraints on View Structure

structure(s) && s == [multicast, ... seq]  =>  |seq| == 1
structure(s) && s == [multiplex, ... seq]  =>  |seq| == 1

Exactly one structure must be specified for dynamically created views.

A.4.  Operations  on  Structure 

If the core relations define a domain of structural models, the constraints on legal structures are axioms, and a
specific structure is a formal sentence, then operations on structure are formal rules of inference on sentences.

We present HPC operations in three stages. First, we define some auxiliary operations that are not sound
when used by themselves. Using these definitions, we present the core structural operations, which are sound when
invoked with the appropriate preconditions. Finally, we reduce the operations available to HPC clients into core
operations.

A.4.1. Auxiliaries


View Hierarchies

\r._creat.e  (r.,  c,  p,  s,  v; 

let: 

pnscondit  io.’ts : 
effects: 

dorain  //  rc  manoe 

syster.  //  no  cnanot 

ocntrolier  //  nc  cr^noc 

view  ■»«  {  V  > 

ccrpnen:  if  CerrtvMp))  the'  {  <v,  p>  ) 

frerber  {  <v,  d>  < 

bounc  //  no  chan?? 

point  5_up  //  nc  cnaro? 

vstruct  {  <v,  s>  ■- 

viable  if  Gart)lcx(v)  ((  en^int{v)  cher.  {  v  ) 

index  —  if  rult:plex(p)  ther.  <  <r.,  v>  )  else  {  <i,  v>  ) 

•djaoent  //  nc  chan?? 

oennectec  //  nc  cna-T?? 

ios5_polic\-  /<•'  nr  cr^rtx 

iflandle{v)  n  endpoint(v) 
for  ci  ir.  {  1,  ...  Is.ocrpcntTtsI  ) 
new{c) 

vh__cre*te {r.,  ci,  c,  v,  s.axponentsicii,  c) 

vh_destrDy(i,  d,  p,  s,  v) 
let: 

Fx:  —  point  sjL:plv) 

VI  —  Viable  (V) 

preoonditions: 

effects: 

for  c.-  in  ocrponent'r' V 
let  Ci  —  rooi'l'r.' 
let  *:  —  vstrjct'l'c.’ 

\^_dBStrt>>Mci,  d,  v,  s;.  cr.'; 

domain  //  no  charoe 

ay3rt«n  //  no  change 

controller  //  no  chan^ 

view  —  {  V  • 

oonpawtt  —  if  (!srp;y(p}>  then  <  <v,  p>  ) 

witia:'  —  {  <v,  d>  J 

bound  //  no  charge 

poinu^qp  //  no  change 

vatruct  -•  (  <v.  »>  ) 

viable  "•if  v".  ther  '  v  * 

index  —  (  <v.  t>  J 

ad3acart  //  no  change 

czmected  //  no  <±arap 

loesjpcUcy  //  no  cTwngr 


Renaming

vh_rename(do, dn, vo, vn)
let:
    po = component'1'vo
    pn = component'1'vn
    s  = structure'1'vo
    i  = index'1'vo
    cv = connected'1'vo
preconditions:
effects:
    domain        // no change
    system        // no change
    controller    // no change
    bound         // no change
    points_up     // no change
    viable        // no change
    adjacent      // no change
    loss_policy   // no change
    view          += { vn }
    component     += if (!empty(pn)) then { <vn, pn> }
    member        += { <vn, dn> }
    vstruct       += { <vn, s> }
    index         += { <vn, i> }
    connected     += (cv X {vn})

    for co in c
        new(cn)
        vh_rename(do, dn, co, cn)

    view          -= { vo }
    component     -= if (!empty(po)) then { <vo, po> }
    member        -= { <vo, do> }
    vstruct       -= { <vo, s> }
    index         -= { <vo, i> }
    connected     -= (cv X {vo})
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sr.  rerare  (ao-- ,  czl.  cri.  -c,  oz,  zr ,  br.- 


let:
    v1 = viable(to)
    v2 = viable(bo)
    p1 = points_up(to)
    p2 = points_up(bo)
    a1 = adjacent'1'to
    a2 = adjacent'1'bo
preconditions:
    bound(to, bo)
effects:
    domain        // no direct change
    system        // no direct change
    controller    // no direct change
    view          // no direct change
    component     // no direct change
    vstruct       // no direct change
    index         // no direct change
    loss_policy   // no direct change
    bound         -= { <to, bo>, <bo, to> }
    points_up     -= if p1 then to
    points_up     -= if p2 then bo
    viable        -= if v1 then {to}
    viable        -= if v2 then {bo}
    adjacent      -= if !empty(a1) (a1 X {to}) + ({to} X a1) + {<to, to>}
    adjacent      -= if !empty(a2) (a2 X {bo}) + ({bo} X a2) + {<bo, bo>}

t^^renar*  (del, 

tc,  t-' 

vr__rBna.Te  {fic2, 

,  <± 

be,  Lt  ' 

    bound         += { <tn, bn>, <bn, tn> }
    points_up     += if p1 then tn
    points_up     += if p2 then bn
    viable        += if v1 then {tn}
    viable        += if v2 then {bn}
    adjacent      += if !empty(a1) (a1 X {tn}) + ({tn} X a1) + {<tn, tn>}
    adjacent      += if !empty(a2) (a2 X {bn}) + ({bn} X a2) + {<bn, bn>}

<io_twyr(»{£ij,  <r;,  to.  fcc,  hr.) 

Ik: 

pr9cardiwiar.s: 

iflt  db  n«Bt»r'l'bc 

ih_i«nai»»  {dc,  «fo.  to.  be.  tr,.  far:) 

If  claarttc.  be:* 
for  cot  ib  ad.jaosnt'  -  ihx-? 

newCent) 

n*w(»C4 

let  cob  ••  bound' I'ccr. 

ac_anif!*{dc.  <?v,  orx,  taij.  ctx,  enfaj 


A.4.2. Core Operations
Connections

c_create(v1, v2)
let:
preconditions:
effects:
    domain        // no change
    system        // no change
    controller    // no change
    view          // no change
    component     // no change
    member        // no change
    bound         // no change
    points_up     // no change
    vstruct       // no change
    viable        // no change
    index         // no change
    adjacent      // no change
    connected     += { <v1, v2>, <v2, v1> }
    loss_policy   // no change

c_destroy(v1, v2)
let:
preconditions:
effects:
    domain        // no change
    system        // no change
    controller    // no change
    view          // no change
    component     // no change
    member        // no change
    bound         // no change
    points_up     // no change
    vstruct       // no change
    viable        // no change
    index         // no change
    adjacent      // no change
    connected     -= { <v1, v2>, <v2, v1> }
    loss_policy   // no change
c_clear(v)
let:
    vs = descendant'2'(adjacent'1'v)
preconditions:
effects:
    domain        // no change
    system        // no change
    controller    // no change
    view          // no change
    component     // no change
    member        // no change
    bound         // no change
    points_up     // no change
    vstruct       // no change
    viable        // no change
    index         // no change
    adjacent      // no change
    connected     -= (vs X vs)
    loss_policy   // no change
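
The single effect of this operation, removing the product vs X vs from the connected relation, can be sketched as a set difference. The function name clear_connections and the sample view names are hypothetical; how vs is computed from the adjacent and descendant relations is not modelled here.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def clear_connections(connected: Set[Pair], vs: Iterable[str]) -> Set[Pair]:
    """Drop every connection whose two ends both lie in vs (connected -= vs X vs)."""
    members = set(vs)
    return connected - set(product(members, members))


connected = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
remaining = clear_connections(connected, {"a", "b", "c"})
```

Here the a-b connection disappears because both ends are in vs, while c-d survives because d lies outside vs. Since vs X vs is itself symmetric, removing it from a symmetric relation leaves a symmetric relation.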
Shells

s_split(d, vu, su, vsu, vl, sl, vsl)
let:

preconditions:

ad;'a3Br*.' 1' vs^  =  vr-  -  vi! 
od^aaeni' 1' vsl  »=  vrj  -  vsl 

jiev(r.) 


effects:
    vh_create(n, 0, d, { }, sl, vl)
    vh_create(n, 0, d, { }, su, vu)
    domain        // no change
    system        // no change
    controller    // no change
    view          // no direct change
    component     // no direct change
    member        // no direct change
    bound         += { <vu, vl>, <vl, vu> }
    points_up     += { vu }
    vstruct       // no direct change
    viable        += { vu, vl }
    index         // no direct change
    adjacent      += (vsu X {vu}) + ({vu} X vsu) + { <vu, vu> }
    adjacent      += (vsl X {vl}) + ({vl} X vsl) + { <vl, vl> }
    adjacent      -= (vsu X vsl) + (vsl X vsu)
    connected     // no change
    loss_policy   // no change

s_merge(d, vu, su, vsu, vl, sl, vsl)
let:
preconditions:
effects:
    domain        // no change
    system        // no change
    controller    // no change
    view          // no direct change
    component     // no direct change
    member        // no direct change
    bound         -= { <vu, vl>, <vl, vu> }
    points_up     -= { vu }
    vstruct       // no direct change
    viable        -= { vu, vl }
    index         // no direct change
    adjacent      -= (vsu X {vu}) + ({vu} X vsu) + { <vu, vu> }
    adjacent      -= (vsl X {vl}) + ({vl} X vsl) + { <vl, vl> }
    adjacent      += (vsu X vsl) + (vsl X vsu)
    connected     // no change
    loss_policy   // no change
    vh_destroy(0, d, { }, su, vu)
    vh_destroy(0, d, { }, sl, vl)


Views 
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V  cTcatoic-,  r.,  i 


preoorni-tic-r : 
effec-.s: 

vr._cneate (r,  1,  c.,  rl,  s..  cl' • 

for  pr  ir,  privace' I'rl 
Ic:  dr  —«  neriKr'l'r: 
lec  sr  ■=  vscrjcr. '  I'rr 
if  viable (pr) 
new(cr) 

\r._crea*.e(r.,  1,  cr,  pr,  sr .ccrcorencs  [1  ] ,  cr) 
v_6eslj:n>M,  s,  p, 


preoondit  lociS : 
eff«xr.s: 

vh_descrcy!i,  c  r..  s,  c 


Processes 

p_crBate(sL,  cl,  roo,  ro.,  rr.-,  rr,!) 


pjBOondiiiors; 


ffec-.s: 

oo_r«nare  (s-. 

c., 

,  n., 

rr.-,  : 

donaLn 

{  cl  . 

•yK«- 

i  C.  ; 

oortroUer 

// 

nc  crAt?? 

view 

r 

rt  c.rer- 

cTArof 

orpaner. 

// 

nc  c-rer. 

cranof 

pvrber 

/ 

re  c.rtrr. 

barv 

It 

nc  direc". 

cfvtnae 

pOintS^Lp 

n 

nc  c.rer. 

crarni^e 

vacruK 

// 

nc  Qirer. 

change 

viable 

— 

(  ^ 

Inckx 

// 

nc  irer. 

(range 

ad>acr.*. 

f! 

nr  ecrer. 

crange 

cxmactec 

/' 

nc  c-rer. 

tranpe 

loasjJoUcy 

// 

no 
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p  oes'.rcvir^,  ci-,  ra-,  r:.,  rr.l) 

lei : 

efiecis: 


do  rer^Tj  (c. 

Q^srair 

'  cL  ' 

sysier 

-- 

•  =1  f 

ccrirclier 

/  ^ 

rc  Gianoe 

view 

// 

nc  d-rec*. 

cranae 

carpnent. 

// 

rc  direct 

cringe 

merrber 

// 

nc  aired 

cringe 

bamc 

// 

rc  direct 

cringe 

poir.i5_u: 

// 

nc  cirect 

c-i"ge 

vstrjr. 

/ 

nc  d.rect 

C'inge 

viable 

— 

■  rri  ) 

iraex 

// 

nc  aized 

cringe 

adjaae.'i 

// 

nr  direct 

CTiW 

oarvneciec 

/■ 

nc  direct 

cinge 

lossjxlic,- 

// 

nc  ctaroe 
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Domains 

c_spl:*.  ;c‘.,  ::x,  v~-.  cc*..  ccr' 

Ic; : 

preoD-vi::  io-z : 
new  (or  > 
new(cr. 

new(cri: ) 
new{c-r' 

efleCLs; 

OCTdir  •;  C%  QC 

sys'.er  •  gc  ' 

oor.trclle:  — ■  •  cat  • 

v:ew  //  rc  cirec*.  cranoc 

ocrpone''*  //  nc  cLrer.  cnirt* 

trwxe:  nc  c.rt^.  cringe 

bound  //  nc  c:rec:  cn^nge 

prin:s_up  //  nc  direc*.  orange 

vsirjct  //  nc  c.rer.  cnvigt* 

v:-»ri«:  —  •  car  • 

index  //no  cir«r.  d'Ange 

*d;*acv.  I'  nc  cire^.  dange 

covieciec  /  nc  direc*.  cringe 

lossjxilic/  {<cr..  terpcr4ry,  sujpend>,  <cb.,  permneri.  jJDdiCAte> 

cb_nBnart! (G,  or.  cot,  col.  c-z.  cnb) 
do_rerare (d,  cr..  rc*.,  roc,  rro,  rob) 

oijt«rg#(d.  roi,  roc,  roi,  roc,  oot,  cob) 

ie.: 

F  -pc:ir/:'c 
cr.  —  fwwr’  I'  ro: 
dc  •••  rentier' 1' car 

precrrditions: 

newc^-.-J 

r*w(crb' 

effoc*.  s: 


cto^rora.T»  (cr. 

C. 

ert,  cct,  CTO,  erfc) 

Qc^rerjr*  (dr.. 

c. 

m.  roc.  rot.  ere) 

dtrolr. 

i  <r^  d.  » 

eyscer 

— 

'  dr  ) 

controller 

-• 

<  OCC  ; 

view 

// 

ID  cirec;  cnenge 

// 

rc  direct 

tmntKi 

// 

ID  direct  tfvmge 

bpzti 

//  nc  Oirocr  ct»ng» 

poirtisjjp 

// 

no  dUDcs.  denge 

vatrucx 

// 

ro  direct  ciengv 

vubk* 

•:  cab  i 

index 

// 

rc  direct  ctwvje 

rndyuMTS. 

// 

no  direct  ctenge 

oomtciec: 

ll  no  direct  cnengc 

Umi._policy 

— 

tidi  X  pj 
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Structure  Hierarchies 

t  kill  (3-,  vj,  v:,  nl) 


sj  =  vs^  rjr. '  ■. '  '/- 

di  =  nerber'  V  vl 
si  »=  vstrjct'l'vl 

preccnditicr^; 

.'controller  (vl) 

effects: 

if  splice{vj,  vl' 
new(xl) 
new’(>cj\ 

S_ciBstrcy (vj,  vl,  rj,  nl,  xl,  xj} 

if  inferior  (vu,  U  syster.tdl) 
p_destroy(d;,  dl,  vj,  vl,  ns,  nl) 

if  irieriorivj,  d:)  i(  !s\v>.er(dl) 
new(xj) 
neu{^l) 

6  nergeidj,  vj,  vi,  xj,  xl,  c,;,  cl) 
t“)cll(dL;,  xt;,  xl,  r..,,  rd) 

if  clear (vu,  vl) 
c_cleir(vl) 

for  cu  in  adjaoer.t'l'vl  -  {vl) 
let  cl  “  bcjnd'l'cL: 
let  seu  —  vstract'l'a; 
let  scl  -■  vstruct'l'cl 
c_clear  (cu) 

t_kill(dl,  cu,  cl,  n.;,  nl) 
s_rttngpe(cL.  nj,  sea,  ac^aoent' I'na  -  (nu), 
n.,  sc.,  adj*aent'l'nl  -  (nl)) 
A.4.3. HPC Primitives
Connections

connect(d, v1, v2)
let:
    s1 = vstruct'1'v1
    s2 = vstruct'1'v2
preconditions:
    view(v1)                      type
    view(v2)
    member(v1, d)                 privilege
    member(v2, d)
    samespace(v1, v2)             local composition
    extension(s1)                 view structure
    extension(s2)
    complementary(s1, s2)
    connected'1'v1 = { }          not connected
    connected'1'v2 = { }
effects:
    c_create(v1, v2)

disconnect(d, v1, v2)
let:
    s1 = vstruct'1'v1
    s2 = vstruct'1'v2
preconditions:
    view(v1)                      type
    view(v2)
    member(v1, d)                 privilege
    member(v2, d)
    connected(v1, v2)             mutually connected
effects:
    c_destroy(v1, v2)
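
The way a client primitive checks its preconditions and then reduces to a core operation can be sketched as follows. This is a simplified model under stated assumptions: only the view, member, and connected relations are represented, the structural preconditions (same space, complementary view structures) are omitted, and the names State, PreconditionError, require, and peers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Pair = Tuple[str, str]


class PreconditionError(Exception):
    """Raised when a primitive is invoked without its preconditions."""


@dataclass
class State:
    view: Set[str] = field(default_factory=set)
    member: Set[Pair] = field(default_factory=set)      # <view, domain>
    connected: Set[Pair] = field(default_factory=set)   # kept symmetric


def require(cond: bool, label: str) -> None:
    if not cond:
        raise PreconditionError(label)


def peers(st: State, v: str) -> Set[str]:
    return {b for (a, b) in st.connected if a == v}


def connect(st: State, d: str, v1: str, v2: str) -> None:
    require(v1 in st.view and v2 in st.view, "type")
    require((v1, d) in st.member and (v2, d) in st.member, "privilege")
    require(not peers(st, v1) and not peers(st, v2), "not connected")
    st.connected |= {(v1, v2), (v2, v1)}      # effect of c_create


def disconnect(st: State, d: str, v1: str, v2: str) -> None:
    require(v1 in st.view and v2 in st.view, "type")
    require((v1, d) in st.member and (v2, d) in st.member, "privilege")
    require((v1, v2) in st.connected and (v2, v1) in st.connected, "mutually connected")
    st.connected -= {(v1, v2), (v2, v1)}      # effect of c_destroy


st = State(view={"a", "b"}, member={("a", "d"), ("b", "d")})
connect(st, "d", "a", "b")
```

A second connect on the same pair fails the "not connected" check, and disconnect restores the empty connected relation.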


Shells

enclose(d, vs1, su, sl)
let:
    views = adjacent'1'vs1
    vs2 = views - vs1
    vsu = if (points_up'1'vs1) then vs1 else vs2
    vsl = if (points_up'1'vs1) then vs2 else vs1
preconditions:
    for v member vs1                                   argument type
        view(v)
    structure(su)
    structure(sl)
    for v in vs1                                       privilege
        member(v, d)
    complementary(su, sl)                              view structure
    !empty(vs2)                                        partition of space
    for v1 in vs1
        for v2 in vs1
            adjacent(v1, v2)
    connected'1'(descendant'2'vsu) subset vsu          composition unaffected
    connected'1'(descendant'2'vsl) subset vsl
    new(vu)
    new(vl)
effects:
    s_split(d, vu, su, vsu, vl, sl, vsl)

disclose(d, vu, vl)
let:
    vsu = adjacent'1'(vu) - {vu}
    vsl = adjacent'1'(vl) - {vl}
    su = vstruct'1'vu
    sl = vstruct'1'vl
preconditions:
    view(vu)                                           argument type
    view(vl)
    member(vu, d)                                      privilege
    member(vl, d)
    shell(vu, vl)                                      merger of spaces
    connected'1'(descendant'2'(vu)) = { }              composition unaffected
    connected'1'(descendant'2'(vl)) = { }
effects:
    s_merge(d, vu, su, vsu, vl, sl, vsl)


t>pe 


pr^Gcnd-iicrjs: 
view  (p, 

e;t.'OoL-.l  (p) 

ra:*tipiex(p)  ||  milt  least  (p) 

nercerip,  c) 

nev>'  (n) 
rev*’  (c) 

effects; 

v_create(ci.  n,  p,  s.corconer.ts;!],  c) 

delete  (d,  c) 
let: 

p  corpenent' I'c 

s  vstrjct'  I'c 

:  index' 2' c 

preoonditior^ ; 
view  (c) 

rultiplex(p)  ||  nult least  (p) 
tTerber(c,  d) 


view  structure 


pri\t.legE 


argirent  type 
view  structure 
privilege 


ornected'l' (ciesoenda'!t'2'c)  •  {  ) 


unoomectod 
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Process  Manipulation 

a.'n.-eLe  (cj,  v.-,  vl,  si,  :> 

[sirpie,  enc^:nT.,  U'xr.trcl,  oj*.]] 

(srrpie,  endpcir.i,  [cxntrcl,  ir.jj 

SGJ  =  ftjfxxiie,  erdpoir.i,  [rulticasi,  endpoint,  roc]] 

scl  -=  [fcundle,  endpcir.t,  [rulticast,  en:;point,  rrci}j 

preaondit  ions : 

view{vu)  type 

view{vl) 

tnsrriDer  (vj,  d)  privilege 

rerberfvl,  di 

adjacent' I'vi  — =  {  vl  )  erpty  leaf 

cmnect.ed'1' (cesoeriGant'2'vl)  =  {  } 

if  sj  !*  {  )  view  strjcture 

corplener.taryfsj,  si) 
i  <=  Is-j.arponer.tsi 

new  (cl ) 
new(rL;; 
new(nl) 

effects: 
if  sj  —  {  ) 

p^create (d;,  cl,  vj,  vl,  n’j,  nl)  //  create  sorple  doraLn 
if  yj  !•  {  ) 

new(cu)  //  create  leaf  for  ooM:xoller 

new(cl) 

s_spl:t{Gr-,  cu,  set:,  ivl},  cl,  scl,  {  }) 

let  oe  —  oorponent'2'c-  //  create  rrulticast  view 

newfonj 

vjneate(di:,  1,  mcc,  oe,  orO 

new{xj)  //  create  leaf  for  nerager 

new(xl) 

s_^lit{dj,  X-:,  yj,  (vl,  cu),  xl,  si,  {  ]) 

let  re  —  {  V  :  oorponent  (v,  xj)  ((  index  (i,  v)  )  //  oorvwct  nenager 
let  rns  vstnjct'l'ne 
c_create(re,  cr,; 

new(q;)  //  create  process  in  leaf 

new(yu) 

new(yl) 

p_«seate(di,  dp,  xu,  xl,  yu,  yl) 

d_^lit(dL;,  dl.  yo,  yl,  no,  nl)  //  create  a  oonplex  dcrein 
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k:ll(av,  vjJ 

le*. : 
vl  »= 

:  bcjrtii-ryiv.  cv,  ii  c^'  &i  cs:-.^rc2 ^er  (c)  I 

cl  *=*  bOLTCi'  I'cu 
rl  s-?x;.':cr'2'cc, 
ru  “  bojrd'  1'  rl 
dr  «=“  ntjrcer'  1'  rc 

preoondi  tiers: 

viewtvj)  type 

frmber{vj,  dv!  privtlooe 

new  (ri2) 
new'r.l) 

effects; 

if  below (cu,  vl)  I!  XtVrcscrioelc-:,  vl* 
if  poLici’'l,2'<d'.',  DenTarnv'.t>  —  Abdicre 
d_fTei-T5e(clr,  rc,  r.,  nc,  rl,  cj,  cl' 

if  pclic/ l,2'<c\-,  pcr:ar«r.t>  •«  die 
t_Kiil(cr,  r-,  rl,  rt:,  r.l' 

if  '.(h^lowlcxi,  vli  1!  sarercAoe  (cj,  vl)) 

WilKd.’,  Vl,  vl,  ni,  r,:! 

<iie{dl,  vi; 
let: 

vu  —  boLsnd'i'vl 
si  •"  vstruct'l'vl 
du  —  Bwter'  1 '  Vl 

preconditiora:: 

\a«w(vlj 

superior  (vl,  dl) 
new(rt;) 

r»(r4l) 

effects: 

t_)aU(di,  VC,  vl,  nc,  r,l) 


type 

privile^B 
Domain Manipulation

invest(d, rot, cnot)
let:
    rob  = bound'1'rot
    cnob = bound'1'cnot
    cot  = { v : boundary(v, d) && bound(v, c) && controller(c) }
    cob  = bound'1'cot
preconditions:
    view(rot)                                          argument type
    view(cnob)
    member(rot, d)                                     privilege
    member(cnot, d)
    clear(rot, rob)                                    root white box
    clear(cnot, cnob)                                  controller white box
    points_up(rob)
    points_up(cnob)
    !(below(cot, rob) || samespace(cot, rob))          d keeps old controller
    below(cnob, rob)                                   dn gets new controller
    adjacent'1'cnob = { cnob }                         empty
    connected'1'(descendant'2'cnob) = { }
    vstruct(cnob, [bundle, endpoint,                   view structure
                   [multicast, endpoint,
                    [simple, endpoint, [control, in]]]])
    new(rnt)
    new(rnb)
effects:
    d_split(d, rot, rob, rnt, rnb, cnot, cnob)

(d, rot)
let:
    rob = bound'1'rot
    do  = member'1'rob
    cot = { v : boundary(v, do) && bound(v, c) && controller(c) }
    cob = bound'1'cot
preconditions:
    view(rot)                                          argument type
    inferior(rot, d)                                   privilege, black box
    new(rnt)
    new(rnb)
effects:
    if !splice(rot, rob) && policy'1,2'<do, permanent> = abdicate
        d_merge(d, rot, rob, rnt, rnb, cot, cob)
    if splice(rot, rob) || policy'1,2'<do, permanent> = die
        t_kill(d, rot, rob)
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abcuc<^ie  (s,  rob' 


et : 

rot 

bound'  1'  rrt 

a' 

=“  1' 

co: 

V  ;  Done.-:',  •. 

,  r.  i>  cfl>.i-civ 

,  c)  ii  contrc-ier (c: 

cob 

“  bound'  1 '  cr  t 

preaondi' ions : 

view  (rob)  arrijnen:.  type 

superior  (rot,  c)  privilege,  black  bcil 

new(mt) 

neu(rnb) 

effects: 

if  !3plioe(rot,  nx)  a  pc.icy’ 1, 2' <dc,  perTTanent>  —  abdicate 
djTergB{cin,  rot,  roc,  rr.t,  mr,  cot,  cob) 

if  ^lioe{rot,  roe)  M  pc.icry' 1, 2' <dc,  perrranerO  —  die 
t  kilKct-.,  rot.  ror 


Splices

sz.'.ceiz,  v„l,  v-_r) 


le'. : 

vll  fco.nd'  1'  ‘'.-I 
vrl  “  prespljotx:  (v«.,  v^r 
vr^  —  prespliOBd(\'-r,  v'jli 
iru  ^  intentodia'.eivul,  v^r' 
inr  ir»terrBdiat.e  (vor,  v-li 
sol  —  st-rjC-aiTp' 1'  {vjI  ■ 
sll  ■—  strjcture' 1' '.vll  ■ 
sor  •»  structure';'  wur* 
va  —  adjacent'  1'  {v-ul  j 

CO  —  GciTponert5'2' 'VO- ■’ 
c.t  “  oorponents'^' 1  • 

preocnditions: 

view(vul)  arjjrent  type 

view (vll) 
view (vor) 

pgpfeer  (d.  vol)  privlleoe 

Trertxrid.  vll» 


ccr plerET.tary  ( sc  1 ,  scti  view  structure 

adjacent'!' (vll)  -•  (  vll  *  «Tty  1**^ 

oonneciod'l' (dB9(xndaL-;t'2' (vll))  —  {  ) 


new  (03) 
new()(i) 
niw(xr) 
effects: 

If  ini  in  view 

S_create(vrJ,  vrc,  vol,  vll,  t'l,  irr) 
if  !  ini  in  viev: 

»_spiit{d»,  xl.  sol.  (  I.  xr,  sll.  (  )) 
S_create{iru,  inr,  vol,  vll,  xl,  xr) 

notes: 

creates  itjri  roots 


A.5. Soundness and Completeness
A.5.1. Soundness

A full proof of soundness is a simple but quite laborious task. There are 12 core operations and about 50
constraints on 14 core relations that must be preserved, which gives roughly 12 x 50 = 600 individual proof
obligations. Many of the 600-odd individual proofs are trivial, but a large number require non-trivial inference.

The proofs for connect and disconnect are especially simple. They affect only the connected relation, and only
by adding or removing a symmetric pair of tuples. Their preconditions immediately establish seven of the nine
constraints on connected. Adding a symmetric pair establishes the final two.

Actually, as presented in this Chapter, splice is not sound. The following analysis is typical of the more
interesting (dis)proofs. To be unintrusive, splice of a and b does not rename b. After splicing, b's peer is in the
domain a was in. A view is a member of one domain during its whole lifetime, and a view has the same private peer
during its whole lifetime. Therefore, b's peer must have been in that domain before splicing. All spaces of a domain
are contiguous, but b's peer was not in the domain before splicing: contradiction. One of the intermediate views is
also created with no superior views, in a domain other than root. In the HPC implementation these unsoundnesses
are avoided by using a more flexible (and complicated) constraint on domain contiguity.

A.5.2.  Completeness 

Completeness up to isomorphism (trivial relabelling) can be shown in a more elegant fashion. First, observe
that every structure can be reduced to the null structure by a sequence of core operations. A further, non-trivial
observation is that each such sequence has an inverse (up to isomorphism). Therefore, any true sentence can be
derived from any other.
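
The shape of this argument can be shown on a toy model. The sketch below is an illustration under strong assumptions, not the HPC proof: a structure is reduced to its connected relation alone, the only operations are c_create and c_destroy (which happen to be exact inverses here), preconditions and relabelling are ignored, and the helper names run, reduce_to_null, invert, derive, and sym are hypothetical. It derives one structure from another by reducing the source to the null structure and then replaying the inverse of the target's reduction.

```python
from typing import FrozenSet, List, Tuple

Pair = Tuple[str, str]
Op = Tuple[str, str, str]  # (operation name, v1, v2)
INVERSE = {"c_create": "c_destroy", "c_destroy": "c_create"}


def apply_op(state: FrozenSet[Pair], op: Op) -> FrozenSet[Pair]:
    name, a, b = op
    pair = frozenset({(a, b), (b, a)})
    return state | pair if name == "c_create" else state - pair


def run(state: FrozenSet[Pair], ops: List[Op]) -> FrozenSet[Pair]:
    for op in ops:
        state = apply_op(state, op)
    return state


def reduce_to_null(state: FrozenSet[Pair]) -> List[Op]:
    """A sequence of operations taking `state` to the empty (null) structure."""
    return [("c_destroy", a, b) for (a, b) in sorted(state) if a < b]


def invert(ops: List[Op]) -> List[Op]:
    """Inverse sequence: inverse operations in reverse order."""
    return [(INVERSE[name], a, b) for (name, a, b) in reversed(ops)]


def derive(src: FrozenSet[Pair], dst: FrozenSet[Pair]) -> List[Op]:
    """Reduce src to null, then build dst by inverting dst's own reduction."""
    return reduce_to_null(src) + invert(reduce_to_null(dst))


def sym(*pairs: Pair) -> FrozenSet[Pair]:
    return frozenset(p for (a, b) in pairs for p in ((a, b), (b, a)))


src = sym(("a", "b"))
dst = sym(("b", "c"), ("c", "d"))
steps = derive(src, dst)
```

Running steps from src yields dst. In the full system the difficulty lies in the two points the toy hides: most operations lack exact single-step inverses, and the sequence must be expressible in client primitives.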

Establishing inverses is non-trivial for two reasons. Most core operations do not have an exact inverse
operation. For example, new may create components of remote views, while delete always destroys exactly one
view. Fortunately, every core operation has an inverse sequence of operations. A harder problem is that not every
sequence of core operations can be produced by a sequence of HPC primitives, and it is necessary to demonstrate
that the effects of any primitive can be undone by a sequence of primitives.